PREFACE


This report to the Joint Services Technical Advisory Committee presents a
summary of the research programs in the broad field of electronics conducted during
the past year at the Polytechnic Institute of New York. These programs are pursued
within the framework of the Microwave Research Institute, and they involve the
academic research activities of faculty in the departments of Electrical Engineering and
Electrophysics, Physics, and Chemistry. The research projects cover a broad
spectrum ranging from basic theoretical investigations in physics, applied mathematics,
and engineering to experimental efforts involving basic measurements and the
development of devices and materials.

The  format  of  this  annual  report  permits  a coherent  presentation  of  the  various 
phases  of  the  Joint  Services  Electronics  Program  (JSEP)  at  the  Polytechnic  and  their 
relation to ongoing research in electronics sponsored by other agencies. This
presentation is intended for the information of the Air Force Office of Scientific Research, the
Army  Research  Office  and  the  Office  of  Naval  Research  and,  in  addition,  the  other 
sponsors  who  are  individually  acknowledged  throughout  the  report.  The  principal  aims 
of  the  JSEP  are  to  initiate  deserving  lines  of  research  in  a timely  fashion  and  to  develop 
investigations  to  a stature  sufficient  to  attract  individual  support  on  their  own  merits. 

In  the  early  days  of  the  Microwave  Research  Institute,  the  research  program 
consisted  primarily  of  projects  involving  electromagnetics  and  microwave  components. 
Although  the  name  of  the  Institute  has  remained  the  same,  the  nature  of  the  research 
programs  has  broadened  substantially,  and  the  programs  now  encompass  a wide  range 
of  topics  within  the  field  of  electronics.  The  current  programs  are  organized  into 
twelve  areas:  electromagnetics;  acoustics;  optics;  quantum  electronics;  solid  state  and 
materials;  wave-matter  interactions;  electric  power  engineering;  communications; 
computers and computer-communication networks; safety, reliability and software
engineering;  systems,  control  and  networks;  and  data  processing.  A short  description 
of  the  nature  of  these  programs  is  presented  in  the  Introduction,  on  pages  xi  through 
xxxii. 


Arthur  A.  Oliner 
Director 
THE MICROWAVE RESEARCH INSTITUTE PROGRAMS

of  the 

POLYTECHNIC  INSTITUTE  OF  NEW  YORK 

(1975 - 1976)


This introductory section contains an overall summary of the research
programs in the broad field of electronics conducted at the Polytechnic Institute of
New York during the past year. These programs are organized below under the
following descriptive subject headings:

ELECTROPHYSICS: Electromagnetics; Acoustics; Optics; Quantum
Electronics; Solid State and Materials; Wave-Matter Interactions; and
Electric Power Engineering.

SYSTEMS: Communications; Computers and Computer-Communication
Networks; Safety, Reliability and Software Engineering; Systems,
Control and Networks; and Data Processing.


I.  ELECTROPHYSICS 

A.  ELECTROMAGNETICS 
Program Director: A. Hessel


The various investigations which fall under the broad heading of
electromagnetics involve the propagation, guiding, radiation, and diffraction of electromagnetic
waves  in  a large  variety  of  environments.  Included  under  this  category  during  the  past 
few  years  are  major  programs  in  wave  types  near  interfaces,  radiation  from  and 
scattering  from  periodic  structures,  various  antenna  investigations,  particularly  those 
involving  planar  and  conformal  phased  arrays,  radiation  from  sources  and  scattering  by 
obstacles  in  media  with  relatively  arbitrary  properties,  a systematic  exploitation  of 
ray-optical techniques in propagation, guiding, scattering and antenna problems, and
topics  related  to  biological  hazards  at  microwave  frequencies. 

The  program  on  wave  types  near  interfaces  some  years  ago  introduced  the 
concept  of  leaky  waves,  showed  its  value  in  the  explanation  of  many  radiation  phenomena, 
and  laid  the  foundations  for  a new  class  of  traveling-wave  antennas,  the  leaky-wave 
antenna,  which  permitted  better  agreement  between  theoretical  and  measured  radiation 
patterns  than  any  other  type  of  antenna.  The  program  led  to  a very  general  study  of 
wave  types,  including  several  categories  of  complex  guided  wave  and  lateral  wave,  and 
showed  their  interrelations.  The  lateral  wave  was  also  shown  to  be  the  mechanism 
which  permits  most  of  the  point-to-point  communication  in  a jungle  or  forest  environment. 

A current  investigation  which  relates  to  wave  types  near  interfaces  involves 
the lateral beam shift encountered by a beam of finite width when incident upon an
interface under appropriate conditions. If the beam is incident upon the interface between
two dielectric half-spaces at or very near to the critical angle of total reflection, the
resulting beam shift is called the Goos-Hänchen shift. This shift is a special case of
a larger  class  which  is  currently  under  systematic  investigation.  At  the  critical  angle 
of total reflection, the incident beam couples to a lateral wave which propagates along
the  interface.  If  a layer  is  present  on  the  interface,  or  if  a periodic  grating  is  located 
there,  and  if  the  beam  is  incident  at  the  angle  corresponding  to  a leaky  wave  on  the 
layer  or  grating,  a very  pronounced  lateral  beam  shift  is  found.  This  beam  shift,  which 
is  related  to  a pole  in  the  wavenumber  spectral  plane,  is  much  larger  in  general  than 
the  beam  shift  due  to  the  lateral  wave,  which  corresponds  only  to  a branch  point.  This 
beam-shift  mechanism  has  recently  been  used  to  achieve  high  coupling  efficiencies  at 
optical  frequencies  between  laser  beams  and  thin  films.  The  beam  shift  itself  offers 
a simple  condition  for  optimizing  the  coupling  parameters.  The  general  properties  of 
such  beam  shifts  and  their  application  to  optical  coupling  devices  are  under  systematic 
study. 
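
As a small numerical aside, the critical angle of total reflection referred to above follows from Snell's law: for a beam incident from the denser of two dielectric half-spaces, sin(theta_c) = n_transmit / n_incident. The sketch below only illustrates that relation; the function name and the refractive indices are illustrative assumptions and are not taken from this report.

```python
import math


def critical_angle_deg(n_incident: float, n_transmit: float) -> float:
    """Critical angle (in degrees) for total reflection at a dielectric interface.

    Total reflection is only possible when the beam arrives from the
    optically denser half-space, i.e. n_incident > n_transmit.
    """
    if n_incident <= n_transmit:
        raise ValueError("total reflection requires n_incident > n_transmit")
    return math.degrees(math.asin(n_transmit / n_incident))


# Illustrative values only (assumed): glass (n = 1.5) against air (n = 1.0).
theta_c = critical_angle_deg(1.5, 1.0)
print(f"critical angle: {theta_c:.2f} degrees")
```

A beam incident at or very near this angle is the case in which the Goos-Hänchen shift discussed above arises.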


The  lateral  beam  shift  is  also  required  for  a correct  ray-optical  model  for 
modal  propagation  in  thin  films  and  optical  fibers.  Such  modal  propagation  studies, 
which  also  involve  the  excitation  of  guiding  structures  of  this  type  by  Gaussian  beams, 
and  the  optical  coupling  studies  mentioned  above,  are  considered  in  more  detail  in  the 
Optics  section.  The  leaky-wave  beam  shift  is  also  shown  to  permit  the  optimization  of 
the coupling efficiency of acoustic wedge transducers and couplers from acoustic
surface waves to linear acoustic waveguides; these studies are reported in the Acoustics
section. 


The program on periodic structures has many ramifications. One phase
relates to general studies of radiation from surfaces or interfaces which are modulated
periodically,  and  in  this  context  the  Brillouin  diagram  for  radiating  periodic  structures 
was  first  introduced  and  its  usefulness  clarified.  This  phase  also  included  studies  of 
various  types  of  periodically-modulated  slow-wave  antennas.  Another  phase  involved 
the  study  of  special  symmetries  in  periodic  structures,  particularly  screw  and  glide 
symmetry.  Studies  of  scattering  by  periodic  surfaces  led  to  a new  and  physically- 
satisfying  theory  of  Wood's  anomalies  on  optical  gratings,  an  effect  which  remained 
poorly understood for 60 years. This theory shed substantial light on scattering
resonances in general, and these ideas have been subsequently applied to other areas, such
as  scattering  from  plasmas  and  radiation  from  phased-array  antennas.  These  studies 
are  now  being  extended  to  the  investigation  of  effects  due  to  the  boundedness  of  finite 
beams. 


Another  study  relating  to  periodic  structures  led  to  a new  way  of  treating 
mutual-coupling effects in phased arrays which takes these effects into account
rigorously and automatically, and it proposed the first compensation scheme for the
minimization of such effects. A recent study extended this technique to the explanation of
resonance effects in large phased arrays, a phenomenon that can cause the array to
become unexpectedly "blind" at certain scan angles. This theoretical and experimental
study achieved a rather complete understanding of this phenomenon, including its causes and how
it may be avoided. In addition, a guided-wave approach has been applied specifically
to  the  analysis  of  a slot-fed  phased  array  on  a conducting  circular  cylinder.  This 
analysis  is  the  first  one  for  such  a structure  in  which  mutual-coupling  effects  have  been 
completely  taken  into  account. 

A concentrated  effort  is  under  way  to  establish  methods  of  analysis  for 
mutually-coupled arrays of aperture elements on non-planar conducting convex and
concave surfaces, using both modal and asymptotic (ray) methods. Non-planar arrays
of this type are called "conformal arrays" because they are often located on the skins
of aircraft, where they must "conform" to the shape of the structure (wing, fuselage,
etc.) on which they are located. The motivation for this effort is twofold: a) for
application  to  a recent  interesting  development  in  antenna  arrays  called  the  "Dome 
Antenna",  which  is  a feed-array  combination  of  a single  planar  phased  array  with  a 
passive, dome-shaped, feed-through lens, and b) for the design of flush-mounted slot
arrays  on  conical  or  ogive  surfaces  for  homing  missile  applications. 

While  the  design  of  conformal  arrays  generally  requires  the  knowledge  of  the 
coupling  coefficients  and  element  patterns  in  a mutually-coupled  convex  environment, 
a conformal  feed-through  scanning  lens  array  design  necessitates,  in  addition,  the 
knowledge  of  mutual  coupling  for  arrays  on  concave  conducting  surfaces.  The  following 
topics are currently being pursued: (1) the analysis of dually-polarized open-ended
circular  waveguide  elements,  and  (2)  the  coupling  coefficients  in  cylindrical  arrays  of 
waveguide-fed  axial  slits. 

Analysis  shows  that  the  decay  of  the  coupling  coefficients  in  arrays  on  concave 
surfaces may differ radically from that on planar arrays. The decay rate may be
considerably slower (even for ka ~ 100, where a is the local equivalent cylindrical radius).
This feature indicates that the use of a planar array matching network may not be
adequate, and that the size of the test array used for the measurement of coupling
coefficients on concave surfaces should be larger than that used in the planar case.
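
For a sense of scale, the electrical size ka quoted above can be converted to a radius measured in wavelengths. The sketch below assumes that k is the free-space wavenumber 2*pi/lambda, which the report does not state explicitly; the function name is hypothetical, and the value ka = 10 is included only for comparison.

```python
import math


def radius_in_wavelengths(ka: float) -> float:
    """Return a / lambda for a given ka, assuming k = 2*pi / lambda."""
    return ka / (2.0 * math.pi)


for ka in (10.0, 100.0):
    ratio = radius_in_wavelengths(ka)
    print(f"ka = {ka:g}: radius is about {ratio:.1f} wavelengths")
```

Under this assumption, ka ~ 100 corresponds to a local radius of curvature of roughly 16 wavelengths, so the slower decay of coupling persists even on surfaces that are only gently curved on the scale of a wavelength.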

The  design  of  homing  missile  slot  arrays  requires  the  knowledge  of  coupling 
coefficients  and  element  patterns  on  conical  or  ogive  conducting  surfaces.  A computer 
program  based  on  the  harmonic  series  solution  for  a conical  geometry  is  extremely 
time consuming. To alleviate this difficulty, analytical expressions based on principles
of the Geometric Theory of Diffraction (GTD) have been obtained; these extend the existing
GTD formulas, which are valid for the torsionless case, to geometries with torsion.
Based  on  these  expressions,  a computer  program  has  been  developed  for  the  calculation 
of  the  GTD  radial  or  circumferential  slot  far-field  patterns  on  conical  surfaces.  The 
numerical  results  compare  well  with  those  based  on  the  harmonic  series,  except  in  the 
forward  axial  (tip)  region,  and  in  regions  of  low  intensity  in  the  GTD  expressions, 
where  the  tip  scattering  that  was  neglected  in  the  GTD  analysis  is  of  importance. 

For  the  computation  of  mutual  coupling,  GTD  expressions  were  derived  for  the  surface 
current  due  to  a magnetic  current  element  located  on  a conical  surface,  based  on  the 
asymptotic  solution  of  the  canonical  problem  of  a circular  cylinder.  For  axial  slots, 
the mutual admittance values calculated via the GTD expressions for a circular
cylindrical surface for ka ~ 10 agree, both in the deep shadow and for the near neighbors, with
harmonic series results found in the literature. For circumferential slots, the
asymptotic expressions must include additional truncation functions.

The  current  program  includes  the  following  topics:  the  analysis  of  spiral 
elements  for  phased  array  applications;  the  performance  of  a two-dimensional  scanned 
lens  array;  modal  analyses  of  mutual  coupling  on  concave  spherical  surfaces;  ray 
methods  for  two-  and  three-dimensional  concave  surfaces;  and  the  synthesis  of  wide- 
angle  scanning  lens  arrays. 

The  program  on  propagation  and  scattering  in  media  with  arbitrary  properties 
was  motivated  by  the  recognition  that  recent  technology  requires  a knowledge  of  the 
propagation  of  electromagnetic  waves  in  physical  environments  of  increased  complexity, 
and  of  the  radiation  characteristics  of  antennas  and  the  scattering  characteristics  of 
obstacles  in  these  environments.  Examples  of  such  media  are  provided  by  ionized 
plasmas,  either  in  the  laboratory  or  in  outer  space,  whose  electrical  properties  can 
be  described  macroscopically  in  terms  of  an  inhomogeneous  or  homogeneous  isotropic 
dielectric,  an  anisotropic  dielectric,  or  a mechanically  deformable  dielectric  material, 
the  choice  of  a particular  model  being  dependent  on  the  circumstances  in  question.  In 
addition,  turbulent  processes  in  the  medium  may  require  the  inclusion  of  statistical 
properties in its description. When a plasma medium surrounds an antenna or a
scattering object, a situation encountered when a rocket or satellite passes through the
ionosphere or when a high speed vehicle re-enters the upper atmosphere, the
above-mentioned processes of radiation and scattering become relevant for problems of
radio  communication  and  detection.  They  are  also  relevant  for  optical  detection  of 
satellites  and  other  objects  since  the  optical  signal  traverses  an  inhomogeneous  and 
turbulent  atmosphere  on  its  way  to  the  ground-based  detector. 

In these studies, ray-optical concepts play an important role. The
exploration of the range of validity of ray-optical procedures in propagation, guiding, radiation
and scattering in a general environment, and the subsequent application of these
techniques to problems of current interest, provide the basis for much of this quasi-optic
phase  of  the  electromagnetics  program. 

Particular attention in the quasi-optic program has been given to the following
subject areas: 1) propagation of pulses in a dispersive environment; 2) scattering by
discontinuities in homogeneously and inhomogeneously filled waveguides or ducts;
3) propagation and scattering of exponentially-decaying (evanescent) fields in lossless
or lossy regions by complex ray techniques; 4) inhomogeneous wave tracking. To deal
with the first category, an asymptotic theory based on space-time rays (a
generalization of geometric-optical rays) and plane-wave dispersion surfaces has been developed.
Complicated  transient  wave  processes  are  thereby  described  in  terms  of  the  motion, 
evolution, and interaction of wave packets. The theory has been applied to
spatially-varying, temporally-varying and moving plasma media, and the results obtained are
relevant  for  the  study  of  pulse  degradation,  pulse  compression  and  related  phenomena. 

In the second category, novel applications of ray-optical (i.e., high-frequency)
methods have been found to yield remarkably accurate results for radiation from, and
scattering by, discontinuity elements in guiding structures, even in the
relatively low-frequency propagation range of only the dominant mode. The technique has recently
been  applied  to  VLF  mode  excitation  and  conversion  in  the  earth-ionosphere  waveguide, 
and  has  now  been  extended  to  multiwave  media  such  as  compressible  plasmas  or  elastic 
solids.  Moreover,  by  viewing  stable  and  unstable  optical  resonators  as  open-ended 
waveguides  along  the  coordinate  transverse  to  the  mirror  axis,  a novel  and  promising 
analytical  method  for  determining  losses  and  field  configurations  in  open  resonators  is 
presently  under  study.  The  method  has  so  far  been  remarkably  successful  in  explaining, 
by  a simple  model,  the  complicated  loss  behavior  of  eigenmodes  in  unstable  resonators, 
which  are  widely  employed  with  high-power  lasers.  This  study  is  discussed  further  in 
the  Optics  section. 

Emphasis  in  the  third  category  is  placed  on  determining  the  local  propagation 
properties  of  evanescent  fields  for  the  purpose  of  developing  therefrom  a theory  of 
propagation  and  scattering  analogous  to  the  geometrical  theory  of  diffraction  for  non- 
evanescent  fields.  While  evanescent  fields  can  be  regarded  as  traveling  along  ray 
trajectories in complex space, the investigation seeks to clarify the physical
significance of "complex rays" and their possible utility in constructing the fields. Results
obtained are of interest for communication with, or detection of, objects located in the
refraction shadow region of an antenna illuminating an inhomogeneous lossless
environment, and for similar problems involving lossy media; for gap coupling to
totally-reflected optical beams; and for calculation of diffraction losses in open optical
resonators. 

Alternative  to  the  complex  ray  approach  is  a new  theory,  which  provides  a 
means  of  tracking  inhomogeneous  wave  fields  in  real  coordinate  space.  Examples  of 
inhomogeneous wave fields are evanescent fields, leaky waves and Gaussian beams.
The advantage of this theory is that it provides direct information on phase front
distortion and amplitude changes as the wave or beam penetrates a medium and is reflected,
refracted or scattered; in particular, the theory accommodates scattering events that
lead  to  substantial  field  distortions  as,  for  example,  scattering  of  a Gaussian  beam  by 
an  obstacle  whose  size  is  comparable  to  the  beam  width.  Results  have  been  obtained 
for  various  evanescent  wave  and  beam  scattering  problems,  and  further  studies  are  in 
progress.  A very  recent  development  involves  application  of  evanescent  wave  tracking 
to  modal  propagation  in  graded-index  optical  films  and  fibers.  This  entirely  new 
analysis  promises  to  render  tractable  a broader  class  of  index  variations  than  can  be 
accommodated  by  presently  used  modal  or  ray-optical  techniques. 

It  has  long  been  known  that  there  are  various  biological  hazards  attendant  on 
the  use  of  high  microwave  power.  The  human  is  particularly  sensitive  in  this  respect. 
There  exists,  therefore,  strong  motivation  for  the  establishment  of  tolerance  limits 
and,  ultimately,  safety  standards.  A study  of  the  effects  of  microwave  radiation  on 
the eye, employing rabbits as subjects, is continuing.


B. ACOUSTICS

Program Director: H. L. Bertoni

A major  program  that  has  been  underway  for  several  years  deals  with  guided 
acoustic  waves  propagating  on  the  surfaces  of,  or  at  interfaces  between,  elastic  solids. 
Because  of  the  extremely  low  velocity  of  elastic  waves,  as  compared  to  electromagnetic 
waves,  and  their  correspondingly  small  wavelength,  these  waves  have  found  application 
in a variety of miniaturized signal processing devices, whose electromagnetic
counterparts would be cumbersome, or not at all feasible. Such devices include delay lines,
pulse compression filters, band pass filters, high frequency resonators and, in
conjunction with semiconductors, convolvers, correlators and storage elements. These devices
involve  acoustic  wave  interactions  and  are  therefore  acoustic  in  nature,  but  they  are 
fitted  with  input  and  output  electro-acoustic  transducers  and  are  employed  in  electronic 
systems. 
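
The size advantage described above can be made concrete with a short calculation of wavelength (velocity divided by frequency) and of the delay accumulated over a fixed path. This is only an order-of-magnitude sketch: the acoustic velocity of 3000 m/s, the 100 MHz operating frequency and the 1 cm path are assumed illustrative values, not figures from this report.

```python
C_EM_M_PER_S = 3.0e8         # free-space electromagnetic velocity (rounded)
V_ACOUSTIC_M_PER_S = 3.0e3   # assumed: typical order of magnitude for elastic waves in solids
FREQUENCY_HZ = 100.0e6       # assumed operating frequency
PATH_M = 0.01                # assumed propagation path of 1 cm


def wavelength_m(velocity_m_per_s: float, frequency_hz: float) -> float:
    """Wavelength of a wave with the given velocity and frequency."""
    return velocity_m_per_s / frequency_hz


def delay_s(path_m: float, velocity_m_per_s: float) -> float:
    """Time taken to travel path_m at the given velocity."""
    return path_m / velocity_m_per_s


em_wavelength = wavelength_m(C_EM_M_PER_S, FREQUENCY_HZ)
acoustic_wavelength = wavelength_m(V_ACOUSTIC_M_PER_S, FREQUENCY_HZ)
acoustic_delay = delay_s(PATH_M, V_ACOUSTIC_M_PER_S)
em_delay = delay_s(PATH_M, C_EM_M_PER_S)

print(f"electromagnetic wavelength: {em_wavelength:.3g} m")
print(f"acoustic wavelength:        {acoustic_wavelength:.3g} m")
print(f"delay over 1 cm, acoustic:        {acoustic_delay:.3g} s")
print(f"delay over 1 cm, electromagnetic: {em_delay:.3g} s")
```

With these assumed values the acoustic wavelength is about five orders of magnitude smaller than the electromagnetic one, and a 1 cm acoustic path provides a delay of a few microseconds, which is why delay lines and filters of this kind can be miniaturized.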

Motivated  by  these  applications,  the  program  has  sought  to  explore  device 
applications  as  well  as  the  basic  wave  scattering  and  coupling  phenomena  that  occur  in 
such  devices.  Many  of  these  scattering  and  coupling  phenomena  also  arise  in  methods 
which  are  used  for  the  nondestructive  testing  of  solid  materials  for  flaws,  cracks,  etc. 
Our  program  is  currently  exploring  the  ways  in  which  known  methods  of  nondestructive 
testing  or  evaluation  can  be  made  more  quantitatively  accurate,  and  also  devising  new 
methods. 


In order to study acoustic wave scattering and coupling phenomena in a
systematic manner, we introduced and developed a microwave network and transmission
line formalism for acoustic waves in isotropic solids. Using this formulation,
transmission-line representations have been derived for bulk acoustic waves, and
equivalent networks were obtained for several types of planar interface. These
networks in turn were used to derive the characteristics of waves guided by a variety of
planar  structures,  such  as  free  or  welded  plates  and  plated  surfaces. 

The transmission line and network approach also forms the basis for a
systematic investigation of the properties of several basic waveguides for acoustic surface
waves.  These  waveguides  are  linear  (as  opposed  to  planar)  configurations  which  serve 
to  laterally  confine  the  acoustic  fields;  the  structures  under  detailed  examination  in  the 
program  include  the  strip  and  slot  waveguides,  which  are  flat  overlay  guides,  and  the 
rectangular  ridge  topographic  and  tall  overlay  waveguides.  The  studies  are  primarily 
theoretical,  but  some  measurements  are  also  being  taken.  The  analytical  results  for 
the  strip  and  slot  waveguides  are  the  most  accurate  ones  available,  and  the  analyses 
for the rectangular ridge structures form the only analytical theories to date for those
guides, although very accurate numerical results have been published for the
topographic structure. These analytical expressions, furthermore, agree well with both
our  own  measurements  and  those  of  others.  We  are  also  studying  the  properties  of 
the  ridge  structure  of  semi-infinite  height,  which  can  guide  waves  along  its  top  edge, 
in  order  to  better  understand  the  behavior  of  ridge  waveguides  of  finite  height. 

The transmission line and network approach is not only useful in the analysis
of waveguides, but it has also permitted quantitative design in a novel method for the
efficient excitation of these linear waveguides. This method is a two-dimensional
version of the prism coupler used in integrated optics for coupling a laser beam to a
surface wave on a thin film. In its application to the excitation of the acoustic strip
waveguide, an efficiency of 65% over a 25% band was achieved without any cut and try.

Novel  components  for  acoustic  surface  waves  are  also  being  devised,  and  they 
are being analyzed with the help of the networks mentioned above. Among these
components are simple filters, bulk wave suppressors, power splitters, nonreflecting
delay-line taps, and resonant cavities, all based on either total reflection or partial
reflection produced by a narrow strip of plated material. These components have the
added advantage that they can be used on any substrate, piezoelectric or not. The
effect of a beam of finite width on the performance of these components is presently
under theoretical study. A transmission line representation for bulk waves in
piezoelectric crystals has also been developed, and it was used to treat a novel class of
bulk  wave  filters. 

An  important  class  of  surface  acoustic  wave  devices  for  signal  processing 
employs  nonlinear  space  charge  effects  in  semiconductors  adjacent  to  piezoelectric 
materials.  Studies  being  performed  here  relating  to  this  class  of  devices  include 
convolvers,  an  acoustic  phase-locked  loop,  correlators,  time  compressors  and  storage 
devices.  The  use  of  semiconductors  in  proximity  to  a piezoelectric  substrate  was  first 
proposed  at  the  Polytechnic;  subsequent  work  here  has  added  to  both  the  theoretical  and 
experimental  development  of  such  devices.  Besides  the  original  work  on  convolvers, 
the  acoustic  phase-locked  loop  was  invented  here,  and  a fully  integrated  version  is  now 
under  development.  Real  time  correlation  and  time  compression  were  also  obtained 
here,  and  these  devices  are  currently  being  refined,  together  with  a storage  correlator. 

The  first  complete  theoretical  analysis  has  been  carried  out  of  the  phenomena 
associated  with  the  reflection  of  a bounded  acoustic  beam  from  the  surface  of  a solid 
at  the  angle  for  phase  matching  with  the  surface  wave.  The  primary  effect  is  a large 
lateral shift of the reflected beam from the position predicted by geometrical optics.
Experimental observations carried out elsewhere for the case of a beam incident from
a fluid  are  in  complete  agreement  with  the  theory.  Subsequent  studies  for  a beam 
incident  from  a second  solid  have  led  to  a complete  theoretical  understanding  of  the 
performance  of  wedge  transducers  for  exciting  surface  waves;  this  theory  has  shown, 
for  the  first  time,  how  high  efficiencies  can  be  achieved  for  such  a device.  These 
phenomena  also  offer  a new  method  for  the  nondestructive  evaluation  of  the  surface 
properties  of  a solid,  and  their  use  is  currently  under  investigation. 

In  addition,  we  are  studying  the  propagation  of  waves  resulting  from  the  coupling 
of  magnetic  moments  in  a ferrite  material  to  the  elastic  stress.  The  dispersion  and 
polarization  characteristics  of  such  magnetoelastic  waves  have  been  found  for  both  bulk 
and  surface  waves. 
C. OPTICS

Program Director: L. Bergstein


Great  progress  has  been  made  recently  in  broadening  the  research  program  in 
optics.  The  theoretical  studies  have  been  extended  and  the  experimental  effort  has  been 
expanded  to  include  the  fabrication  and  measurement  of  integrated  optics  structures  and 
related  thin-film  devices.  The  present  program  consists  of  the  following  topics:  the 
development  of  a new  narrow-band  matched  optical  filter  (theoretical  and  experimental), 
research  in  integrated  optics  and  related  thin-film  devices  (theoretical  and  experimental), 
and  the  investigation  of  unstable  optical  resonators  (theoretical  only).  The  topic  of 
integrated  optics  is  the  one  which  is  expanding  most  rapidly,  and  it  currently  consists 
of  several  different  projects. 

Development  is  progressing  on  a class  of  narrow-band  optical  passband  and/or 
rejection  filters  based  on  selective  reflection  from  atomic  vapors.  These  filters,  which 
can operate in the 580 to 8550 Å range, are rugged, of small size, and have a large
acceptance angle and high spectral resolution. Because of these factors, the vapor filters
are expected to be superior to presently available devices in pollution monitoring and
other applications which require simultaneous multichannel spectrochemical analysis for
trace elements. They are also expected to be useful in the detection of isotope
concentrations, as a narrow-bandwidth ultraviolet mirror for high intensity radiation, and in
astronomical observations. Tests performed on a mercury vapor filter verified the
theoretical predictions. With a center wavelength of 2537 Å, the filter has shown a peak
reflectance of 40%, an acceptance angle of 7 degrees, and a bandwidth of only 0.14 Å.
The filter was used in an emission flame photometer to perform a spectrochemical
analysis for mercury in a water solution. The device was also used to take a monochromatic
photograph of the spatial distribution of excited mercury in the flame from which an
estimate of the flame temperature profile could be made.
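
To put the quoted mercury-filter figures in perspective, the resolving power (center wavelength divided by bandwidth) and the equivalent frequency bandwidth can be worked out directly. The sketch below reads the quoted values as 2537 angstroms and 0.14 angstroms and uses the narrow-band approximation delta_nu = c * delta_lambda / lambda**2; the function names are hypothetical, and the speed of light is the usual rounded value rather than a figure from this report.

```python
C_M_PER_S = 2.998e8     # speed of light (rounded standard value)
ANGSTROM_M = 1.0e-10    # one angstrom in meters


def resolving_power(center_angstrom: float, bandwidth_angstrom: float) -> float:
    """Spectral resolving power, lambda / delta_lambda (dimensionless)."""
    return center_angstrom / bandwidth_angstrom


def bandwidth_hz(center_angstrom: float, bandwidth_angstrom: float) -> float:
    """Frequency bandwidth from delta_nu = c * delta_lambda / lambda**2.

    This approximation holds when delta_lambda is much smaller than lambda.
    """
    center_m = center_angstrom * ANGSTROM_M
    bandwidth_m = bandwidth_angstrom * ANGSTROM_M
    return C_M_PER_S * bandwidth_m / center_m**2


power = resolving_power(2537.0, 0.14)
delta_nu = bandwidth_hz(2537.0, 0.14)
print(f"resolving power: about {power:.0f}")
print(f"frequency bandwidth: about {delta_nu / 1e9:.0f} GHz")
```

On these figures the filter resolves roughly one part in 18,000 of the center wavelength, which is consistent with the high spectral resolution claimed for this class of device.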

This  device  principle  is  being  extended  to  elements  which  are  more  difficult 
to  use  since  they  require  a higher  operating  temperature.  An  example  of  this  type  of 
element  is  sodium.  If  a successful  filter  can  be  built  using  sodium,  the  same  structure 
can be used to contain a variety of elements. At present the fabrication of a cell
utilizing a nickel housing and sapphire windows is nearing completion.

The  purpose  of  the  program  on  integrated  optics  is  to  develop  film-layered 
structures and other micro-optic devices for eventual application in optical
communications. It began with an investigation of the coupling and progression of optical beams
along and through thin films and other optical waveguides. Planar and cylindrical
polarization-independent waveguides have been investigated and were shown to be
practical. Such waveguides can support non-polarized beams and can transmit information
at  twice  the  rate  of  ordinary  waveguides.  For  planar  waveguides  it  was  found  that  phase 
and  group  velocity  matching  of  a TE  and  a TM  signal  can  be  achieved  by  the  addition  of 
only  a single  thin-film  layer.  Such  a scheme  can  also  be  used  in  nonlinear  interaction 
applications  to  match  the  group  velocity  of  two  optical  signals  of  different  wavelengths 
and/or  field  distributions.  Results  analogous  to  those  found  for  planar  waveguides 
have  also  been  found  for  cladded  fibers.  An  approximate  (but  general)  transmission 
line  equivalent  which  we  have  developed  for  this  class  of  structures  reduces  considerably 
the  analytical  complexity  of  the  problem  and  allows  the  determination  of  the  modal 
dispersion  characteristics  in  a fairly  straightforward  manner.  Moreover,  we  have 
found a realizable radially-variable refractive index distribution n(r) for which all
rotationally-symmetrical TE and TM modes have the same group velocity and spatial
power  distribution.  The  modes  and  eigenvalues  of  such  an  optical  fiber  and  its  disper- 
sion properties  were  found. 

The beam-shift phenomenon which occurs when a beam is incident on a planar
structure  at  the  angle  corresponding  to  a leaky  wave  supportable  by  the  structure  was 
mentioned  in  the  section  on  Electromagnetics.  This  phenomenon  was  investigated 
theoretically  in  quite  general  terms,  and  the  conditions  for  optimum  beam-to-surface- 
wave  coupling  due  to  optical  beams  of  finite  width  were  derived  for  both  uniform  thin- 
film  couplers  and  grating  couplers. 

The  techniques  developed  by  us  previously  for  treating  periodic  structures 
have been combined with the beam-shift considerations and applied to the detailed analysis
and design of planar grating couplers. These integrated-optics components serve
the important function of coupling an optical beam into (or out of) a surface wave guided
by thin films. In particular, an exact solution of the boundary-value problem posed by
dielectric  gratings  has  been  formulated  and  a computer  program  has  been  set  up  to 
generate  highly  accurate  numerical  results.  In  this  manner,  the  coupling  efficiency, 
operating  conditions  and  other  data  have  been  determined  for  a large  variety  of  grating 
couplers.  In  addition,  a simplified  perturbation  procedure  requiring  much  shorter 
computation  times  has  also  been  developed  and  found  to  yield  excellent  approximations 
of  the  exact  results.  These  techniques  are  being  used  to  determine  dimensions  for  a 
class  of  blazed  gratings  that  are  capable  of  a substantially  higher  coupling  efficiency 
than  that  of  ordinary  gratings.  All  of  these  techniques  are  instrumental  in  providing  a 
range  of  simple  criteria  for  the  design  of  dielectric  gratings  having  desirable  diffraction 
and  waveguiding  properties. 

A laboratory  facility  has  been  established  and  experiments  are  now  in  progress 
to  verify  some  of  the  analytical  results  and  to  assess  the  performance  of  certain  micro- 
optic devices.  The  facility  includes  a capability  to  produce  solution-deposited  films 
with  a view  towards  making  active  devices,  and  a scanning  electron  microscope  which 
has  been  modified  to  write  patterns  for  optical  waveguides,  gratings  and  circuits.  A 
computer-controlled  digital  interface  was  developed  to  generate  and  control  the  desired 
scanning  pattern. 

A model  has  been  developed  to  describe  the  exposure  and  development  charac- 
teristics of  photoresist.  The  accuracy  of  this  model  has  been  verified  and  the  model 
has  been  applied  to  the  fabrication  of  grating  couplers.  The  couplers  so  constructed 
were  found  to  have  properties  that  agree  well  with  the  predictions  of  the  theory  men- 
tioned above. 

Several  additional  theoretical  investigations  have  recently  been  initiated 
relating  to  the  area  of  integrated  optics.  One  of  these  is  an  extension  of  the  beam-shift 
considerations,  mentioned  in  the  Electromagnetics  section,  involving  lateral  waves 
(rather  than  leaky  waves,  which  are  appropriate  to  the  grating  couplers  mentioned  above). 
These  lateral  ray  and  beam  shifts  on  boundaries  with  incidence-angle-dependent  reflec- 
tion coefficients  are  currently  receiving  special  attention  because  it  has  been  recognized 
that  the  correct  ray-optical  model  for  modal  propagation  in  films  and  layers  requires 
laterally shifted paths. A systematic study of lateral shifts, both longitudinal (Goos-Hänchen
type) and transverse (Imbert type), for isotropic and anisotropic, homogeneous
and  graded-index,  two-dimensional  and  three-dimensional,  guiding  regions  is  presently 
in  progress. 
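
As an illustration of the longitudinal (Goos-Hänchen type) shift, the sketch below evaluates the textbook case of TE total internal reflection at a plane interface between two isotropic, homogeneous media. It is not the systematic study described above. The shift is taken as the derivative of the reflection phase with respect to the tangential wavenumber and compared with Artmann's closed-form expression. The function names, the glass-to-air indices and the 60 degree angle are assumptions chosen for the example; sign conventions vary, and only the magnitude is reported.

```python
import math

def te_reflection_phase(theta, n_ratio):
    """Magnitude of the TE reflection phase under total internal reflection.

    theta   : angle of incidence in radians (beyond the critical angle)
    n_ratio : n2 / n1, with n1 the index of the denser (incidence) medium
    """
    s, c = math.sin(theta), math.cos(theta)
    return 2.0 * math.atan(math.sqrt(s * s - n_ratio ** 2) / c)

def shift_numeric(theta, n_ratio, wavelength_in_medium, d_theta=1e-6):
    """Lateral shift as d(phase)/d(k_x), estimated by a central difference."""
    k1 = 2.0 * math.pi / wavelength_in_medium
    d_phase = (te_reflection_phase(theta + d_theta, n_ratio)
               - te_reflection_phase(theta - d_theta, n_ratio))
    d_kx = k1 * (math.sin(theta + d_theta) - math.sin(theta - d_theta))
    return d_phase / d_kx

def shift_closed_form(theta, n_ratio, wavelength_in_medium):
    """Artmann's closed-form TE shift for the same interface."""
    s = math.sin(theta)
    return (wavelength_in_medium / math.pi) * math.tan(theta) / math.sqrt(s * s - n_ratio ** 2)

# Hypothetical example: glass (n1 = 1.5) to air (n2 = 1.0), 60 degree incidence.
theta = math.radians(60.0)
n_ratio = 1.0 / 1.5
wavelength_in_glass = 0.6328 / 1.5   # micrometers
numeric = shift_numeric(theta, n_ratio, wavelength_in_glass)
closed = shift_closed_form(theta, n_ratio, wavelength_in_glass)
print(f"numeric = {numeric:.4f} um, closed form = {closed:.4f} um")
```

In this example the shift is a fraction of a wavelength, and it grows without bound as the angle of incidence approaches the critical angle.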

Another  new  theoretical  area  involves  the  determination  of  the  propagation 
characteristics  of  linear  waveguiding  structures  for  integrated  optics.  Under  investi- 
gation are  two  waveguides  which  are  composed  of  metal-clad  regions  on  thin  films 
located  on  a substrate.  One  of  these  guides  consists  of  a slot  between  two  metal-clad 
regions;  a basic  constituent  mode  was  neglected  in  analyses  performed  elsewhere,  and 
we  find  that  important  performance  differences  occur  in  a certain  frequency  range. 
The other waveguide is a novel one consisting of a metal strip on a thin film on a
substrate.  This  guide  employs  the  "plasmon"  mode  and  is  lossier  than  other  known 
linear  waveguides,  but  it  seems  to  be  dispersionless  over  a large  frequency  range  and 
may  therefore  be  of  distinct  interest.  The  properties  of  these  two  guides  are  being 
examined  in  greater  detail. 

An  additional  theoretical  topic  is  an  application  of  the  technique  for  the  tracking 
of  inhomogeneous  fields,  mentioned  in  the  Electromagnetics  section.  As  indicated 
there,  application  is  being  made  to  modal  propagation  in  graded-index  optical  films 
and  fibers;  this  new  approach  permits  the  treatment  of  a wider  class  of  refractive  index 
variations  than  can  be  handled  by  other  techniques. 

Several  novel  thin-film  devices  have  been  devised  and  investigated  both  theo- 
retically and  experimentally.  Thin-film  polarizers,  beam  splitters,  and  (fixed  and 
tunable)  narrow-bandwidth  spectral  filters  for  the  visible  and  10-micron  region  have 
been  investigated  and  were  shown  to  be  feasible  in  both  the  usual  optical  beam-shaping 
applications  and  in  integrated  optics  applications,  where  these  signal  shaping  compo- 
nents can  become  part  of  the  coupling  scheme.  Experimental  tests  have  confirmed  the 
theoretical  results.  A five-layer  FTR  narrow-bandwidth  spectral  filter  and  a single- 
layer FTR  beamsplitter  with  transmission  properties  which  are  independent  of  the 
polarization  of  the  incident  beam  were  developed  and  tested  and  their  properties  were 
found  to  agree  with  the  theoretical  predictions. 

Another  new  project  involves  the  analysis  of  certain  geometrical  discontinuity 
problems.  As  a first  example,  we  have  solved  an  important  discontinuity  problem  for 
a class  of  planar  optical  waveguides;  the  discontinuity  consists  of  a change  in  height  of 
a dielectric  layer.  In  our  modal  expansion  approach,  which  is  fairly  general,  we  place 
the  open  waveguide  structure  between  two  conducting  planes  and  then  allow  the  separa- 
tion between  the  planes  to  reach  infinity.  This  greatly  reduces  the  complexity  of  the 
problem  and  permits  the  solution  of  a wider  class  of  problems  than  is  possible  with 
other methods found in the literature. The solutions we have found for single-mode
waveguides agree well with those found in the literature by less general and/or more
cumbersome methods.

The  investigation  of  the  properties  of  unstable  resonators  is  continuing.  Two 
independent  approaches  are  employed;  one  involves  a longitudinal  approach  whereas  the 
other  uses  a transverse,  or  waveguide,  viewpoint,  discussed  briefly  in  the  section  on 
Electromagnetics.  Using  the  longitudinal  approach,  a meaningful,  approximate  solution 
has  been  found  for  the  complete  set  of  modes  and  associated  eigenvalues  of  optical 
resonators  filled  with  an  active  medium  having  transverse  variations  in  both  the  gain 
and  refractive  index.  This  has  led  to  a simple  physical  picture  which  permits  substan- 
tial new  insight  into  the  mechanism  of  operation  of  such  resonators  and  will  help  in  the 
determination  of  the  optimum  resonator  parameters  which  are  to  be  used  in  high-power 
laser  devices. 

For  unstable  resonators  whose  transverse  boundaries  are  determined  by  the 
mirror  dimensions,  the  waveguide  approach  to  unstable  resonators  formed  by  hyper- 
bolic strip  mirrors  has  been  found  successful  in  explaining  in  a remarkably  simple 
manner  the  complicated  behavior  of  the  complex  eigenvalues  of  the  resonant  modes. 
These  results  are  now  being  extended  to  circular  mirror  configurations  and  attention  is 
being  given  to  the  effects  of  inhomogeneity  in  the  resonator  medium. 

D. QUANTUM ELECTRONICS
Program  Director:  W.  T.  Walter 

The  research  program  in  the  area  of  Quantum  Electronics  comprises  that 
portion  of  electronics  in  which  quantum  mechanical  effects  become  important  because 
of  the  discrete  nature  of  charge  carriers,  matter  or  radiation.  Optics  and  Wave- 
Matter  Interactions  are  related  research  areas  whose  boundaries  with  Quantum  Elec- 
tronics are  not  distinct.  Some  of  the  research  previously  reported  in  the  Quantum 
Electronics  area  has  been  shifted  in  this  report  to  the  Optics  or  Wave-Matter  Inter- 
action research  areas. 

A research program has been developed which has attacked outstanding problems
of quantum electronics on a broad front. Significant results include: the development
of a new type of optical and infrared spectroscopy which has demonstrated more
than two orders of magnitude improvement in resolution compared to previous techniques,
a corresponding improvement in the long-term frequency stabilization of lasers,
significant progress in the understanding of the irreducible quantum fluctuations which
are present in the output of lasers and other quantum electronic devices, advances in
the understanding of the propagation of ultra-short pulses in resonant media, progress
in the development of a new wide-angle low-noise laser receiver, an improved understanding
of the properties of the acousto-optical interaction and the use of this interaction
to provide acoustical gain and also to mode lock a ruby laser, a significant
improvement in the brightness and in the average power generation density of the
very-high-gain pulsed copper vapor laser, construction of a computer model to analyze
these pulsed metal vapor lasers, and the development of a procedure for calculating
improved values of atomic transition probabilities.

One group of projects has centered on the study of the nonlinear response of
saturable absorbers to intense laser beams. These projects have yielded a new type
of ultra-high-resolution spectroscopy and a significant improvement in the long-term
frequency stabilization of lasers, and have also contributed to the progress on an
automatically-adapting filter which offers the promise of a significant improvement in the
performance of an optical radar receiver. Laser Saturated Resonance Spectroscopy
(LSRS) has been applied to the study of the absorption spectrum of the SF6 molecule at
10.6 µm and the I2 and Br2 molecules at 0.6328 µm. A resolving power of 10^ was
demonstrated, representing a two-order-of-magnitude improvement over conventional
spectroscopic techniques. The study of the LSRS of xenon has now been completed.
All of the observed components in the LSRS spectrum of a naturally occurring mixture
of xenon isotopes (masses 128, 129, 130, 132, 134 and 136) at 3.5 µm have been resolved
and identified. The narrow Lamb dips have also been used to stabilize the
frequencies of both the CO2 (10.6 µm) and He-Ne (0.6328 µm) lasers. A laser which
is frequency stabilized to a Lamb dip is a strong candidate for selection as an optical
(or infrared) wavelength standard, and will also be useful for long-range interferometry,
for deep-space communication links, for coherent optical radar, and for seismic studies.

The high intensities achievable in laser beams can perturb atomic energy levels
and radically alter the position and shape of spectral lines. The theory of these phenomena
is being extended and refined. Recent studies have centered on the inclusion of
finite lifetime and recoil effects. New results were obtained on the effect of a resonant
driving field on the molecular center-of-mass motion. The analysis exploited a novel
calculation technique based on the application of Floquet's theorem to periodically-driven
quantum-mechanical systems. It is hoped that these interesting effects at optical
frequencies will eventually be observed and compared with theory.
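
For reference, Floquet's theorem for a Hamiltonian that is periodic in time can be stated as below. This is the standard textbook form, not the specific calculation technique developed in this program. The period T, the quasi-energies \varepsilon_\alpha and the periodic functions u_\alpha are notation introduced here for illustration.

```latex
\begin{align}
  i\hbar\,\frac{\partial}{\partial t}\,\psi(t) &= H(t)\,\psi(t),
  & H(t+T) &= H(t), \\
  \psi_\alpha(t) &= e^{-i\varepsilon_\alpha t/\hbar}\,u_\alpha(t),
  & u_\alpha(t+T) &= u_\alpha(t).
\end{align}
```

Each solution is thus a periodic function multiplied by a phase factor, and the quasi-energies play the role that stationary energy levels play for a time-independent Hamiltonian.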

Another theoretical investigation in progress aims at understanding the irreducible
quantum fluctuations always present in the output of lasers and other quantum
electronic devices, from the point of view of first principles. A study has been completed
of the effect of an internal homogeneously-broadened saturable absorber on the
intrinsic fluctuations of a laser oscillator. A related effort is concerned with the
understanding of spontaneous emission processes in resonant dispersive configurations other
than  free  space.  This  problem  has  been  solved  in  general  terms  by  relating  it  to  the 
problem  of  calculating  thermal  field  distributions.  Recently,  the  spontaneous  emission 
noise  and  the  response  characteristics  of  driven  quantum  electronic  systems  have  been 
evaluated  for  classes  of  spatially  degenerate  molecules.  Effects  associated  with  the 
nonstationary  character  of  the  correlation  functions  have  been  particularly  emphasized. 
These  effects  have  been  overlooked  in  recently  published  work  on  the  response  of  driven 
systems. 


The  study  of  the  propagation  of  ultra-short  pulses  in  resonant  media  has  led  to 
the  discovery  of  new  relationships  between  their  spatial  and  temporal  structures.  These 
include: strong self-focusing, phase modulation which accompanies transverse confinement
of the pulse, additional unsuspected structure of plane wave pulses, some
effects  implied  by  the  inclusion  of  a finite  transverse  relaxation  time,  the  transverse 
stability  of  confined  fields,  new  types  of  distortionless  fields  which  simultaneously 
possess  phase  modulation,  polarization  modulation  and  three-dimensional  structure, 
and  some  effects  produced  by  a nonlinear  index  of  refraction  of  the  host  material. 

The  spectroscopic  investigations  of  saturable  absorbers  and  the  study  of 
propagation  of  ultra-short  pulses  in  resonant  media  relate  directly  to  the  feasibility 
study of the low-noise wide-angle BALAD (Bleachable Absorber Laser Amplifier-Detector)
receiver. In this receiver, an absorption cell (filled, for example, with SF6)
is  placed  between  the  laser  amplifier  and  the  detector  to  serve  as  a threshold  filter, 
both  in  the  frequency  domain  and  in  two  spatial  dimensions,  and  it  screens  out  all  back- 
ground light  which  is  not  coherent  with  the  signal  beam.  Calculations  predict  superior 
performance  in  optical  radar  and  communications  applications.  The  LSRS  studies  have 
clarified  the  effect  on  laser  amplifier  gain,  bandwidth  and  band  center  that  would  result 
from  the  use  of  separated  single  xenon  isotope  samples  in  an  optical  radar  system  with 
a BALAD  receiver. 

Visible  and  near-visible  lasers  with  an  efficiency  and  power  comparable  to  the 
far infrared CO2 laser are still lacking. The vapors of certain metals and transition elements
have attractive energy level structures. A number of high-temperature laser
systems  have  been  constructed  to  explore  the  most-promising  metal  vapor  laser  media. 
The pulsed copper vapor laser has high gain and high peak power at 5106 Å. Moreover,
unlike other self-terminating lasers, single mode output and therefore high peak brightness
can be obtained. Recent experiments using higher pulse repetition rates and a
larger volume have produced an average power of 11 watts. With argon as an additive
gas, average powers above one watt have been achieved at 5106 Å from an active region
only 10 cm long and 1.25 cm in diameter, corresponding to an average power production
of 0.1 watts per cubic centimeter. A heat-pipe copper vapor laser has been demonstrated,
and preliminary experiments with a high-loss resonator have yielded a peak brightness
of 10^^ watts per square centimeter per steradian, which is the highest brightness
measured  for  a gas  laser.  Heat-pipes  and  the  use  of  dissipated  discharge  energy  for 
self-heating  have  potential  for  practical  sealed-off  copper  vapor  lasers  with  average  out- 
put powers  up  to  100  watts  in  the  visible.  Still  higher  output  powers  and  overall  elec- 
trical efficiencies  of  up  to  are  anticipated  from  the  current  investigation  of  imped- 
ance tailoring  of  copper  vapor  systems.  Other  metal  vapor  systems  are  being  examined 
for  efficient  laser  action  in  the  ultraviolet  and  blue-green  spectral  regions.  A computer 
model has been constructed and is being used in conjunction with the experimental analysis
of several metal systems. Theoretical analysis of these systems for higher efficiency
lasers has led to the development of a procedure for calculating improved values
of atomic transition probabilities from spectral intensities.
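
The power density quoted above for the argon-additive experiment can be checked from the stated geometry. The sketch below assumes a cylindrical active region, reads the diameter as 1.25 cm, and takes one watt as a lower bound on the average power; the variable names are illustrative.

```python
import math

length_cm = 10.0        # active region length quoted in the text
diameter_cm = 1.25      # active region diameter, as read from the text
average_power_w = 1.0   # "above one watt": one watt taken as a lower bound

# Volume of a cylinder and the corresponding average power per unit volume.
volume_cm3 = math.pi * (diameter_cm / 2.0) ** 2 * length_cm
power_density = average_power_w / volume_cm3
print(f"volume = {volume_cm3:.2f} cm^3, power density = {power_density:.3f} W/cm^3")
```

One watt from this volume gives roughly 0.08 watts per cubic centimeter, consistent with the quoted figure of about 0.1 since the measured power exceeded one watt.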

E. SOLID STATE AND MATERIALS
Program Director: H. J. Juretschke


The  solid  state  research  ranges  over  a wide  area  of  experimental  activities 
encompassing  the  creation  of  new  or  better  materials,  the  investigation  of  many  specific 
properties  of  solids,  and  the  application  of  these  materials  and  effects  to  practical  sit- 
uations. In  addition,  there  are  active  programs  in  various  aspects  of  solid  state  theory. 


In  the  area  of  materials  development,  an  important  sector  is  concerned  with 
new  magnetic  compounds  involving  rare  earth  and  transition  metal  ions,  and  with  the 
production of phosphors for IR-to-visible conversion. Programs involving thin films
include  the  development  of  high-mobility  semiconducting  layers,  and  the  preparation  of 
highest  quality  single  crystal  films  of  the  noble  and  ferromagnetic  metals  and  the  semi- 
metals. In  addition,  there  are  systematic  efforts  to  produce  good  insulating  barriers  on 
semiconductor  and  metal  single  crystal  surfaces,  for  application  to  tunneling  and  other 
devices. 


Among  the  properties  of  interest,  those  related  to  electromagnetic  interactions 
play  a central  role.  Electron  transport  in  semiconductors  is  studied  for  hot-electron 
nonlinear  effects,  and  for  surface  interactions.  Size  and  surface  effects  in  carrier  trans- 
port in  magnetic  fields,  especially  as  related  to  the  band  structure  and  to  surface  relax- 
ation mechanisms,  are  investigated  in  thin  films.  One  group  of  studies  centers  on  the 
spin  dependence  of  electrical  conduction  in  magnetic  materials,  while  another  is  con- 
cerned with  the  characteristics  of  thin  tunneling  junctions,  either  vacuum  or  insulating, 
between  normal  or  superconducting  materials.  New  research  includes  the  interaction  of 
x-rays  with  electrons  in  semiconductors  through  the  internal  photoelectric  effect,  the 
determination  of  charge  concentrations  at  metal-insulator  interfaces  by  surface  electron 
scattering,  the  details  of  metallic  surface  self  diffusion  as  it  affects  the  electrical  sur- 
face properties,  and  the  concentration  profiles  in  alloys  of  thin  films  as  affected  by  the 
large surface-to-volume ratio.


Studies  in  magnetism  include  a study  of  the  magnetic  coupling  between  transition 
metal  ions  in  complex  fluorides,  using  neutron  diffraction,  magnetic  susceptibility  and 
Mössbauer spectroscopy. Magnetic resonance measurements (NMR, ESR) are being
used  to  study  clustering  of  rare-earth  ions  doped  in  CdF2.  This  material  has  been  devel- 
oped as  a base  for  infrared  to  visible  light  conversion,  which  has  potential  application  in 
a variety  of  display  devices.  Another  study  is  looking  at  the  origins  of  the  frequency 
dependence  of  magnetic  anisotropy  in  ferrites  and  at  the  anisotropy  of  their  magnetogyric 
ratio.  A program  in  magnetic  resonance  of  ESR  and  NMR  is  elucidating  details  of  elec- 
tronic structure  in  many  of  the  materials  used  in  other  investigations. 


Optical  investigations  include  a broad  investigation  of  the  Faraday  effect  in 
ferromagnetic  metals  that  aims  at  clarifying  details  of  their  band  structure,  especially 
as  a function  of  temperature.  At  very  short  wavelengths,  in  the  x-ray  region,  studies 
are  pursued  on  quantitative  aspects  of  the  resonant  Borrmann  transmission  in  perfect 
single  crystals  under  conditions  where  three  or  more  distinct  beams  are  in  strong  inter- 
action. Here  an  analysis  of  loss  mechanisms  has  been  critical,  and  we  have  initiated 
work  on  better  determination  of  x-ray  dispersion  coefficients,  and  on  the  energy  distri- 
bution of  the  propagating  waves  relative  to  the  atomic  sites. 


X-ray  investigations  are  concerned  with  crystal  structures,  perfection,  thermal 
effects and charge distributions in a wide range of materials. Determinations of lattice
constants through multiple diffraction measurements are yielding data of considerably
enhanced  accuracy  and  precision.  Multiple  diffraction  effects  are  also  being  applied  in 
a careful  interpretation  of  x-ray  data,  including  linear  absorption  coefficients,  line 
shapes  and  line  shifts. 

Theoretical studies include a fundamental program in electron-photon interactions
in magnetic fields, and the theory of slow-electron scattering in solid surfaces.
In addition, there is work on elastic and inelastic interactions between electromagnetic
radiation and atoms and solids involving excitation of inner electrons, especially as they
apply in the x-ray region, and on the phase-sensitive coupling of optical fields in solids
to suppress noise. Finally, the nonlinear coupling between x-rays and other waves is
studied in the domain of dynamic x-ray modes.

Another,  and  new,  aspect  of  solid  state  studies  involves  the  theoretical  and 
experimental  investigation  of  distributed  parameter  networks  on  integrated  electronic 
circuits.  There  has  always  been  a search  for  ways  to  utilize  an  existing  technology  to 
its  fullest  advantage.  By  using  a distributed  parameter  viewpoint  on  silicon  planar 
technology,  it  is  possible  to  synthesize  functions  on  chips  which  are  much  smaller  and 
less expensive than chips containing lumped elements only. The results of initial tests
have shown the feasibility of manufacturing totally-integrated voltage-tunable RF amplifiers,
IF strips, FM demodulators, oscillators, and other frequency-selective electronic
networks.  The  elimination  of  large  lumped  inductors  itself  results  in  an  appreciable 
saving  in  size,  weight,  material,  and  ultimately,  cost.  The  design  is  simple  and  well 
suited  to  mass  production  techniques  in  silicon  technology. 


F.  WAVE-MATTER  INTERACTIONS 
Program  Director:  N.  Marcuvitz 


Several  different  programs  comprise  the  area  of  wave-matter  interactions.  The 
most  comprehensive  and  challenging  of  these  programs  is  a relatively  new  one  which  in- 
volves the  interaction  between  very  high  power  electromagnetic  waves  and  materials,  in 
which  the  interactions  are  highly  nonlinear  and  turbulent.  Other  programs  have  been 
underway  which  treat  interactions  appropriate  to  more  moderate  energy  levels;  such 
interactions  involve  quasi-linear  processes  which  can  be  treated  either  by  equations 
possessing  weak  nonlinearity  or  by  linearized  equations,  the  latter  leading  to  a large 
variety  of  parametric  processes  which  have  been  studied  by  us  in  detail  over  a period  of 
time.  The  above-mentioned  studies  are  primarily  theoretical  but  they  have  experimental 
phases  as  well.  Other  programs  include  various  experimental  and  theoretical  investiga- 
tions of  plasma  wave  properties  and  plasma  turbulence  effects,  and  studies  in  space 
radiophysics,  which  include  terrestrial  and  extraterrestrial  phenomena. 

A comprehensive  program  has  been  initiated  to  study  phenomena  associated 
with  the  interaction  of  high  power  density  electromagnetic  waves  and  materials.  Its 
motivation  is  to  extend  known  techniques  on  linear  wave  propagation  in  spatially  inhomo- 
geneous and  time  varying  deterministic  media  to  practical  structures  wherein  the  mate- 
rial and  waves  are  nonlinear,  inhomogeneous,  time  varying,  and  turbulent.  Of  particular 
interest  are  self-consistent  studies  of  reflection,  transmission,  and  mode  conversion 
properties of high power waves as a function of frequency, pulse width (spatial and temporal),
etc., for the range of power densities wherein material properties become
markedly  nonlinear  and  experience  phase  changes.  Preliminary  model  computer  studies 
are  being  carried  out  with  the  aid  of  an  interactive  graphic  interpretive  (IGI)  language 
developed for a PDP-11 computer facility; related experimental and theoretical investigations
are also under way.

Another  program,  which  is  also  relatively  recent,  involves  weakly-nonlinear 
interactive  systems;  the  approach  adopted  employs  a scattering  formulation  applied  to 
nonlinear transmission lines. A recent result, involving the general problem of reducing
a nonlinear equation to a linear one, has been to construct, in a simple and straightforward
manner, a general transformation of this kind for a class of generic equations, of which
the well-known Burgers equation and the Korteweg-de Vries (KdV) equation are special cases. Another
important  result  has  been  to  show  on  a rigorous  basis  that  the  energy  in  a lossless 
linear  transmission  line  is  conserved,  in  contradiction  to  the  published  literature  which 
characterizes  the  presumed  violation  of  the  conservation  law  as  a paradox. 
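
As an illustration of what reducing a nonlinear equation to a linear one means, the classical Cole-Hopf transformation for the Burgers equation is shown below. It is a well-known special case of such a reduction and is not the general transformation constructed in this program, which is not reproduced here. The symbols u, \nu and \varphi are notation introduced for the example.

```latex
\begin{align}
  \frac{\partial u}{\partial t} + u\,\frac{\partial u}{\partial x}
    &= \nu\,\frac{\partial^{2} u}{\partial x^{2}}
    && \text{(Burgers equation)} \\
  u &= -2\nu\,\frac{1}{\varphi}\,\frac{\partial \varphi}{\partial x}
    && \text{(Cole-Hopf substitution)} \\
  \frac{\partial \varphi}{\partial t}
    &= \nu\,\frac{\partial^{2} \varphi}{\partial x^{2}}
    && \text{(linear diffusion equation)}
\end{align}
```

Any positive solution \varphi of the linear diffusion equation yields, through the substitution, a solution u of the nonlinear equation.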

The  program  on  parametric  processes,  which  involve  the  linearization  of 
weakly-nonlinear  interactions,  has  been  highly  successful  over  an  extended  period  and 
now  includes  application  to  nonlinear  optics  and  to  processes  involving  plasmas.  Re- 
cent theoretical  studies  have  stressed  the  interaction  of  light  with  plasmas,  including 
the  study  of  stimulated  Brillouin  scattering  (SBS)  and  parametric-decay  instabilities. 
Each  of  these  nonlinear-optical  interactions  involves  the  ion-acoustic  wave  in  the  plas- 
ma. The  dependences  of  the  instabilities  on  the  dispersion,  absorption,  anisotropy  and 
inhomogeneity  of  the  medium  are  of  particular  concern.  The  results  of  this  investiga- 
tion have  practical  applications  in  laser-produced  plasmas  for  fusion  and  ionospheric 
probing  by  radio  waves.  Other  applications  of  parametric  interactions  are  found  in  the 
development  of  new  coherent  optical  sources  for  communications,  and  in  design  of  pulse- 
echo  devices  for  data  processing.  Theoretical  studies  have  employed  a Floquet  expan- 
sion procedure  which  takes  all  the  space-time  harmonics  into  account,  and  is  rigorous 
within the small-signal regime. The rigorous boundary-value problem and the initial-value
problem for instability evolution for an intense light beam incident on a medium
are  being  solved  and  applied  to  a variety  of  physical  cases.  Instability  thresholds  in 
unbounded  regions  have  been  further  investigated  for  interesting  cases  where  two  inter- 
actions (involving  three  frequencies)  merge  into  one  interaction  (involving  four  frequen- 
cies) upon  application  of  a small  change  in  model  parameters.  Such  merger  of  interac- 
tions appears  likely  in  certain  parametric  interactions  taking  place  in  gradient  inhomo- 
geneities. The  study  of  a class  of  problems  dealing  with  intracavity  parametric  inter- 
actions has  been  initiated,  prompted  by  interesting  experimental  results  obtained  in 
mode locking a ruby laser using stimulated Brillouin scattering in a birefringent crystal.
In other experiments, the dynamic build-ups of stimulated Raman scattering (SRS) and SBS phonons in crystals are
being  studied  on  a time  and  space  resolved  basis  in  order  to  illustrate  some  of  the  theo- 
retical predictions  mentioned  above. 
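
The three-frequency parametric interactions discussed above are commonly summarized by frequency and wavevector matching conditions. The block below gives the standard form for a pump wave (subscript 0) decaying into two daughter waves (subscripts 1 and 2), together with the long-wavelength ion-acoustic dispersion relation that applies when one daughter wave is the ion-acoustic wave. The subscripts and the ion-acoustic speed c_s are notation introduced here; these are textbook relations, not results of the Floquet analysis described above.

```latex
\begin{align}
  \omega_0 &= \omega_1 + \omega_2, \\
  \mathbf{k}_0 &= \mathbf{k}_1 + \mathbf{k}_2, \\
  \omega_2 &\simeq |\mathbf{k}_2|\,c_s
    && \text{(ion-acoustic daughter wave, long-wavelength limit)}
\end{align}
```

In stimulated Brillouin scattering the first daughter wave is the scattered light wave, so its frequency differs from the pump frequency only by the small ion-acoustic frequency.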

The  research  activities  related  to  plasmas  include  the  examination  of  plasma 
waves  and  the  study  of  plasma  turbulence.  Both  theoretical  investigations  and  laboratory 
experiments  are  being  conducted  in  these  areas.  Both  linear  properties  and  nonlinear 
effects  of  plasma  waves  have  been  and  are  being  examined,  and  related  experiments  are 
being  performed  to  verify  the  theoretical  findings.  Among  the  topics  which  have  been 
treated are: a) the propagation characteristics of electrostatic plasma waves in a
magneto-plasma; b) plasma heating using the resonance damping of cyclotron waves;
c) evolution of parametrically-excited ion-acoustic instabilities and plasma heating;
d) propagation of kilovolt-subnanosecond base-band pulses along a plasma column;
e) mixing of electromagnetic waves in a plasma through modulation of the electron
temperature; f) coupling effects between electromagnetic modes and electrostatic modes;
g) examination of the wave properties related to microwave interferometry in finite-size
plasmas; and h) resonance excitation of plasma waves by a slotted cavity. The effort in
this  area  is  being  continued  and  coordinated  with  the  study  of  plasma  turbulence. 

The  study  of  plasma  turbulence  has  been  gaining  more  emphasis  recently.  A 
comprehensive  theoretical  restructuring  of  plasma  turbulence  has  been  carried  out,  and 
new experimental facilities were designed and built to perform sophisticated experiments
on  plasma  turbulence.  One  of  the  new  experimental  facilities  is  a long,  hollow-cathode 
arc  system,  which  can  produce  both  quiescent  and  weakly-turbulent  highly-ionized  plas- 
mas. Not  only  were  the  static  characteristics  of  weak  plasma  turbulence  studied,  but 
the  dynamic  effects  of  the  transition  from  the  quiescent  state  to  the  weakly-turbulent 
states  were  also  observed.  Advanced  data  processing  techniques  were  developed  to  gain 
reliable  data  on  plasma  fluctuations.  One  of  the  distinguished  achievements  in  this  area 
is the clear quantitative evidence of the plasma turbulence generated by the electrostatic
ion-cyclotron  waves  and  of  the  turbulent  diffusion  process.  Dynamic  or  feedback  stabi- 
lization techniques  are  being  applied  to  isolate  the  various  terms  contributing  to  the 
growth  rate  of  the  instability  and  the  transition  into  the  weak  turbulence  regime. 

Associated  with  the  investigation  on  plasma  physics  is  the  research  in  space 
radiophysics  of  natural  terrestrial  and  extraterrestrial  phenomena.  Emphasis  is  on  the 
physics  of  the  upper  atmospheres  and  ionospheres  of  the  planets  and  their  interaction 
with  the  sun's  emanations.  The  program  involves  theoretical  and  experimental  studies 
of  disturbances  in  the  ionosphere  as  produced  by  man-made  events  and  by  natural  phe- 
nomena, such  as  thunderstorms.  Our  facilities  permit  the  sounding  of  the  ionosphere 
for  disturbance  detection.  Radio  emissions  at  VLF  as  detected  by  satellites  have  also 
been  studied.  The  problem  of  the  coupling  between  waves  in  the  neutral  atmosphere  and 
waves  in  the  ionosphere  is  presently  being  studied.  Resonances  have  been  discovered 
and  it  is  planned  to  seek  confirmation  both  from  spacecraft  data  and  from  ground-based 
measurements.  Dynamic  interaction  of  the  atmosphere  and  ionosphere  of  planets  due  to 
heating  by  solar  radiation  has  also  been  studied  from  the  standpoint  of  the  outflow  of 
gases  and  ionization  as  related  to  the  formation  of  the  solar  system.  These  methods 
have  been  applied  to  the  Saturnian  Satellite  Titan  and  to  Jovian  satellites,  which  are 
unique  in  the  Solar  System. 

G.  ELECTRIC  POWER  ENGINEERING 
Program  Director:  E.  Levi 

The  recent  energy  problems  and  the  increased  awareness  of  the  deterioration 
of  the  environment  are  likely  to  bring  about  a significant  increase  in  the  share  of  energy 
utilized in electric form. In response to these needs, the Polytechnic has
initiated  a new  research  program,  and  has  enriched  its  academic  curriculum  by  insti- 
tuting electric  power  engineering  options  at  both  the  undergraduate  and  graduate  levels. 

Under  investigation  are  many  critical  areas  in  the  generation,  transmission, 
and  distribution  of  electrical  power.  Of  particular  interest  are:  (1)  the  magnetic  separa- 
tion of  radioactive  isotopes  in  nuclear  fuels  and  waste  products;  (2)  the  application  of 
novel  technologies  to  the  development  of  individual  components,  such  as  generators, 
high  voltage  transmission  lines,  gas  insulated  substations,  circuit  breakers,  fault- 
current  limiters,  rectifiers  and  inverters;  (3)  the  stability,  reliability,  and  economy  of 
large  integrated  systems;  and  (4)  their  effect  on  the  environment  and  the  quality  of  life. 

At  the  utilization  end,  transportation  has  been  singled  out  as  the  primary  area 
of  concern,  since  it  consumes  about  one  quarter  of  the  overall  U.  S.  energy  budget  and 
one  half  of  the  oil,  and  since  it  creates  serious  pollution  problems.  In  the  U.S.  only 
one  percent  of  the  railroad  is  electrified,  as  compared  with  more  than  90  percent  in  the 
other industrialized countries of the world. Besides seeking improvement in passenger
and  freight  train  services,  new  modes  for  individual,  as  well  as  mass  transportation  are 
being  developed.  These  include  electrified  urban  thoroughfares,  as  well  as  highways. 

A significant  effort  is  devoted  to  linear  propulsion  by  means  of  iron-cored  synchronous 
motors.  These  motors  present  challenging  problems  in  electromagnetics  because  of 
their  complex  geometries;  a recent  study  involves  a new  method  for  the  determination  of 
magnetic  fields  in  air  gaps.  Another  problem  which  arises  in  such  linear  motors  is  the 
need  to  achieve  a more  accurate  assessment  of  eddy  current  losses  and  field  penetration 
in  thickly-laminated  iron  structures.  In  addition,  as  a result  of  pioneering  work  con- 
ducted at  the  Polytechnic,  novel  topologies  for  flux  inter-linkages  have  been  conceived. 
II. SYSTEMS

A.  COMMUNICATIONS 
Program  Director:  L.  Kurz 

The  research  program  in  communications  and  information  processing  currently 
covers  topics  in  the  optimization  and  evaluation  of  digital  data  transmission  systems, 
robust  detection  and  estimation  techniques,  and  FM  receivers. 

The  transmission  of  digital  data  in  the  presence  of  intersymbol  interference  and 
noise  is  a classical  problem  in  communications.  The  problem  has  recently  taken  on  an 
even  more  important  significance  as  data  rates  become  increasingly  higher  and,  there- 
fore, the  effects  of  intersymbol  interference  and  noise  more  severe.  The  main  research 
effort  in  this  area  was  concentrated  on  the  optimum  and  suboptimum  signalling  and  detec- 
tion in  the  presence  of  intersymbol  interference  and  gaussian  noise,  detection  in  the 
presence  of  noise  represented  by  a mixture  distribution  model,  and  techniques  useful  in 
the  monitoring  of  channel  conditions.  In  particular,  a new  design  procedure  for  recur- 
sive minimum  mean-square  error  equalizers  was  developed,  two  types  of  detectors 
using  digital  filters  were  investigated,  suboptimum  detectors  for  signals  corrupted  by 
gaussian  and  impulsive  noise  were  analyzed,  and  two  classes  of  direct  and  indirect 
robust  estimators  of  channel  conditions  were  introduced  and  compared  to  existing  pro- 
cedures. Work  is  continuing  on  all  the  above-described  areas  of  digital  data  system 
optimization. 

Considerable  interest  generated  in  detection,  feature  extraction  and  estimation 
problems,  where  little  is  known  about  the  data  to  be  processed,  motivated  the  develop- 
ment in  recent  years  of  several  classes  of  rank  and  non-rank  robust  (insensitive  to 
underlying  distributions)  detection  and  estimation  procedures.  Though  the  rank  detectors 
tend to be more powerful, their non-rank competitors are more robust and simpler to
implement. Thus, the main research effort was concentrated on developing further the
latter  class  of  detectors  stressing  applications  in  the  detection  of  non-constant  signals 
in  mixture  distributions  and  fading  environments,  sequential  and  nonsequential  detection 
of  underwater  sounding  and  two-dimensional  data.  A parallel  effort  included  the  develop- 
ment of  robustized  estimation  techniques  which  are  applicable  in  adaptive  modes  of 
operation  in  data  processing.  In  particular,  a new  family  of  variable-threshold  non-rank 
procedures was introduced and compared to competitors based on the rank procedures;
slippage and analysis of variance techniques were modified to operate in an adaptive
mode; a new class of quadratic three-sample partition detectors, which is useful in
change-of-scale and stochastic ordering detection problems, was generated and shown to
be an effective and easy-to-implement class of competitors to the rank detectors; various
techniques were developed for finding optimum scores and thresholds for partition
detectors; and two basic theorems pertaining to robustized recursive estimation were proven.

Work  in  all  the  areas  of  robust  detection  and  estimation  described  above  will  be  continued. 
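
To make the notion of a non-rank detector that is insensitive to the underlying distribution concrete, the following Python sketch implements the textbook sign detector. It is offered only as an illustration and is not one of the partition detectors developed in this program. It assumes independent samples and noise with zero median and a continuous distribution; the function names and the threshold are hypothetical.

```python
from math import comb

def false_alarm_probability(n, threshold):
    """P(count >= threshold) when each of n samples is positive with probability 1/2.

    Under the zero-median noise assumption this value does not depend on the
    shape of the noise distribution, which is what makes the detector robust.
    """
    return sum(comb(n, k) for k in range(threshold, n + 1)) / 2 ** n

def sign_detector(samples, threshold):
    """Decide 'signal present' if at least `threshold` samples are positive."""
    count = sum(1 for x in samples if x > 0)
    return count >= threshold
```

The threshold is chosen from the binomial tail alone, so the false-alarm rate can be fixed without any knowledge of the noise statistics beyond its median.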

Another  study  is  concerned  with  investigating  a novel  FM  demodulator  capable 
of  combatting  the  effects  of  interfering  signals.  This  demodulator,  consisting  of  two 
phase-locked  loops  interconnected  in  a feedback  arrangement,  has  been  constructed  and 
is  being  studied.  Further  theoretical  and  experimental  analysis  will  continue. 
B. COMPUTERS AND COMPUTER-COMMUNICATION NETWORKS
Program Director: E. J. Smith

The  research  program  in  Computers  and  Computer-Communication  Networks 
covers  a number  of  topics  directed  toward  the  development  of  improved  computer 
architecture,  improved  languages  or  information  structures,  techniques  for  imple- 
menting algorithms,  data  networks,  and  message  switching  systems. 

In computer communications, we are concerned primarily with techniques for
the  better  understanding  of  and  improved  design  of  data  and  computer  networks.  The 
stress  is  on  message  store-and-forward  networks  with  minicomputers  used  to  combine 
or  concentrate  incoming  messages,  to  buffer  or  store  them  if  need  be,  to  route  messages 
either  dynamically,  or  following  prescribed  routing  algorithms.  There  are  interesting 
questions  here  of  the  appropriate  modeling  of  interconnected  networks  of  computers  to 
carry  out  the  necessary  analytical  work  and  of  the  appropriate  choice  of  message  statis- 
tics to  be  used  in  the  modeling.  The  combining  or  concentration  function  is  being  studied, 
with  comparisons  made  between  different  combining  techniques.  Dynamic  buffer  schemes 
are  being  studied,  with  optimum  buffer  size  and  comparison  of  various  schemes  a speci- 
fied goal.  From  the  overall  network  viewpoint  a comparison  of  various  adaptive  routing 
algorithms  is  underway,  as  well  as  methods  for  maintaining  flow  control  throughout  the 
network.  Also  being  studied  are  algorithms  for  network  design  taking  into  consideration 
topology,  capacities,  routing,  reliability,  and  other  factors. 
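
As a minimal sketch of the kind of analytical model involved, a single store-and-forward node can be treated as a single-server queue. The Python fragment below assumes Poisson message arrivals and exponentially distributed service times (the M/M/1 model); these message statistics are an assumption made here for illustration, since the appropriate choice is itself one of the questions under study. The function name is hypothetical.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state figures for one node modeled as an M/M/1 queue.

    arrival_rate: mean messages arriving per second
    service_rate: mean messages the outgoing line can send per second
    """
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival rate must be below service rate")
    rho = arrival_rate / service_rate
    return {
        "utilization": rho,
        # Mean number of messages buffered or in transmission at the node.
        "mean_messages_in_node": rho / (1.0 - rho),
        # Mean time a message spends at the node (waiting plus transmission).
        "mean_delay": 1.0 / (service_rate - arrival_rate),
    }
```

Even this simple model shows why buffer sizing matters: the mean number of stored messages grows without bound as the utilization approaches one.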

Three specific projects have been completed which deal with adaptive routing
and  resource  allocation,  multiple  routing  networks,  and  resource-sharing  in  computer- 
communication  nodes.  The  present  effort  focuses  in  particular  upon  teleprocessing  net- 
work design  algorithms,  channel  assignment  in  mobile  telecommunication  systems,  and 
adaptive  message-switching  networks. 

Investigations  of  the  dynamic  behavior  of  computer-driven  communication  sys- 
tems are  concerned  with  the  development  of  models  which  might  be  useful  as  analytical 
tools  in  the  evaluation  of  such  systems  as  well  as  in  the  ultimate  design  of  improved 
systems.  In  order  to  provide  a focus  for  the  effort,  a particular  message-switching 
system  is  studied  in  detail  and  the  approach  is  directed  toward  the  development  of  an 
interconnectivity  model  in  which  the  system  is  viewed  as  a collection  of  program  modules, 
each  of  which  communicates  with  other  modules  through  an  interconnection  medium  via 
a particular  machine.  This  often  obscures  the  original  intent  of  the  designer  of  the  sys- 
tem and  provides  no  insights  into  the  necessary  data  structures  and  operations  for  per- 
forming message-switching  functions.  Motivated  by  these  considerations,  current  work 
is  directed  toward  the  development  of  a higher-level  algorithmic  description  language  for 
use  in  the  design  and  specification  of  switched  communication  systems.  Desirable  char- 
acteristics of  such  a language  are  defined  and,  as  an  initial  step,  the  ideas  are  applied 
to  the  description  of  a communications  scanner  channel.  The  modules  are  characterized 
by  various  parameters  such  as  core  size,  execution  time,  data  base,  communication 
linkages to other modules, etc.; and the application of the model is viewed with respect
to the evaluation of system changes and debugging difficulty, throughput, and local
optimization of individual models. A first version of a communications-oriented design
language  has  been  completed  and  a current  effort  is  concerned  with  the  investigation  of 
appropriate  data  structural  forms  for  efficient  implementation  of  the  language.  Also 
recently  completed  was  an  approximate  analytic  model  of  a message-switching  system 
which  provides  the  capability  of  predicting  message-processing  delays  from  a knowledge 
of  the  message  switch  architecture  and  the  traffic  statistics,  as  well  as  permitting  one  to 
estimate  the  effect  on  performance  of  transferring  a task  from  one  processor  to  the  other 
in  a dual-processor  system.  The  current  effort  attempts  to  extend  the  model  to  a multi- 
processor system  consisting  of  microprocessor  modules. 
In the area of switching theory, a dynamic fault-test generation scheme has
been  developed  for  combinatorial  logic  circuits,  and  an  extension  of  the  same  technique 
is  being  applied  to  the  case  of  an  asynchronous  sequential  machine  in  which  tests  are 
generated  from  the  acyclic  circuit  after  a near-minimum  number  of  feedback  cuts  have 
been  made  in  the  original  circuit.  Dynamic  tests  are  sought  rather  than  static  tests 
since  the  former  do  not  require  checking  and  consequently  require  less  complex  compu- 
tation. Techniques  are  also  being  investigated  for  the  determination  of  partitions  having 
the  substitution  property,  or  other  properties  of  special  interest,  through  use  of  the 
predicate calculus; efficient algorithms for computation are sought.

In  the  area  of  machine  architecture,  several  studies  are  in  progress,  including 
the exploration of a simple, low-cost, bus-oriented computer having a common memory
and  instruction  format  for  microprograms  and  programs  as  an  array  processor,  and  the 
investigation  of  a variable-precision  arithmetic  technique  in  which  all  numbers  are 
stored  internally  within  the  computer  as  quotients  of  integers.  The  resulting  effect  upon 
computational  accuracy  and  machine  time  will  be  studied;  other  related  schemes  will  be 
sought  out  and  explored. 
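
The trade-off between accuracy and machine time when numbers are held as quotients of integers can be illustrated with a short Python sketch. It uses the standard-library `fractions.Fraction` type as a stand-in; it is not the arithmetic technique under investigation, whose details are not given here, and the function names are hypothetical.

```python
from fractions import Fraction

def harmonic_exact(n):
    """Sum 1/1 + 1/2 + ... + 1/n with every value held as a quotient of integers."""
    total = Fraction(0)
    for k in range(1, n + 1):
        total += Fraction(1, k)  # exact: no rounding at any step
    return total

def harmonic_float(n):
    """The same sum in binary floating point, for comparison."""
    return sum(1.0 / k for k in range(1, n + 1))

exact = harmonic_exact(10)
# The numerator and denominator grow as terms are added, which is the
# storage and machine-time price paid for the exact result.
print(exact, float(exact) - harmonic_float(10))
```

The exact sum carries no rounding error, but its integers lengthen with each operation, so precision and cost both vary with the computation.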

C.  SAFETY,  RELIABILITY  AND  SOFTWARE  ENGINEERING 
Program  Director:  M.  L.  Shooman 

The  areas  of  Safety,  Reliability  and  Software  Engineering  encompass  a broad 
spectrum  of  analytical  modeling,  experimental  systems  research,  and  systems  engineer- 
ing. The  underlying  thread  of  cohesion  in  these  areas  is  the  application  of  modern  tech- 
niques of  probabilistic  modeling,  statistical  measurement  and  experimentation,  and  the 
development  of  design  and  optimization  techniques  within  the  engineering  process.  The 
emphasis  is  at  times  on  the  basic  development  of  the  tools  and  methodology  for  such 
studies  and  often  on  the  application  of  these  new  as  well  as  existing  techniques  to  the 
advancement  of  the  state  of  knowledge  in  the  given  area. 

This  section  describes  and  reports  the  work  of  several  diverse  groups  within 
the  Institute,  some  who  are  developing  technology  in  areas  outside  of  electronics,  but 
with  direct  application  to  electronic  devices  and  systems,  and  others  who  are  working 
within  the  mainstream  of  electronics. 

To  the  probabilistic  modeler,  in  the  abstract,  the  concepts  of  safety,  reliability 
and  availability  have  a unifying  thread.  The  probability  of  no  equipment,  human,  or  pro- 
cedural failure  which  endangers  a human  being  is  expressed  in  the  safety  index.  Simi- 
larly, if  the  failure  described  in  the  preceding  sentence  affects  system  operation  so  as  to 
cause a failure, we are discussing reliability. Clearly, safety and reliability analysis
have  the  same  methodology;  however,  the  definitions,  implications,  and  design  goals 
differ  significantly.  The  availability  index  is  a probability  which  measures  the  percent- 
age of  systems  which  are  up  at  any  specified  time  and  allows  one  to  model  and  measure 
the  effectiveness  of  failure-repair  dynamics.  Maintainability  is  measured  by  repair 
rate,  and  repair  and  failure  rate  combine  through  the  system  configuration  to  determine 
availability.  Lastly,  when  we  turn  our  attention  to  software  we  find  that  the  concepts  of 
system  reliability,  availability,  and  maintainability  apply  well;  however,  it  is  important 
to  emphasize  the  differences  between  hardware  and  software  in  constructing  our  models. 
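
The way repair and failure rates combine through the system configuration can be sketched for the simplest case. The Python fragment below assumes constant failure rate and constant repair rate for each unit, with units failing and being repaired independently; under those assumptions the steady-state availability of one unit is repair_rate / (failure_rate + repair_rate). These assumptions and the function names are introduced here for illustration only.

```python
def unit_availability(failure_rate, repair_rate):
    """Steady-state availability of one repairable unit with constant rates."""
    return repair_rate / (failure_rate + repair_rate)

def series_availability(units):
    """System is up only if every unit is up. `units` holds (failure, repair) rate pairs."""
    availability = 1.0
    for failure_rate, repair_rate in units:
        availability *= unit_availability(failure_rate, repair_rate)
    return availability

def parallel_availability(units):
    """System is up if at least one unit is up (each unit repaired independently)."""
    all_down = 1.0
    for failure_rate, repair_rate in units:
        all_down *= 1.0 - unit_availability(failure_rate, repair_rate)
    return 1.0 - all_down
```

Two units that are each available 90 percent of the time give about 81 percent in series and about 99 percent in parallel, which shows how strongly the configuration enters the result.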

Activities  in  the  safety  area  have  recently  been  stimulated  and  accelerated  by 
the  arrival  of  new  funding  from  the  Department  of  Transportation  for  the  second  year  of 
a proposed  three-year  intermodal  study  of  transportation  system  safety.  This  work  is  an 
interdisciplinary  effort  with  several  participating  departments.  Clearly,  although  much 
of the transportation vehicle and its guideway involves mechanical systems, a very
high and increasing percentage of modern ships, aircraft, rail, and automobile systems
involve electronic control, communications, sensing, guidance, and so on. One of the
chief  methodological  tools  being  applied  to  this  area  is  the  system  fault  tree.  The  most 
important  theoretical  aspects  are  its  construction,  computerization,  and  collection  of 
probabilistic  input  data.  In  this  latter  area  some  new  techniques  of  statistical  estima- 
tion called  consensus  estimation  are  being  developed.  These  differ  from,  but  bear 
resemblance  to,  the  Delphi  techniques  used  in  forecasting.  Also  of  very  great  impor- 
tance is  the  modeling  of  the  human  operator.  In  a transportation  system  it  is  impossible 
to  divorce  the  human  operator  from  the  control  system.  Although  a great  deal  has  been 
done  to  model  the  human  transfer  function,  researchers  have  had  less  success  in  mod- 
eling human  errors  which  vitally  affect  both  reliability  and  safety.  We  have  enlisted  the 
aid  of  an  experimental  psychologist  to  work  with  the  systems  engineers  in  this  area  of 
research  and  hope  that  the  cross  fertilization  of  ideas  will  lead  to  new  approaches. 
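
The computerization of a system fault tree can be sketched in a few lines of Python. The fragment assumes that the basic events are statistically independent, and the small tree and its probabilities are hypothetical, chosen only to show how operator error and equipment failure enter the same calculation.

```python
from math import prod

def and_gate(probabilities):
    """Output event occurs only if every input event occurs."""
    return prod(probabilities)

def or_gate(probabilities):
    """Output event occurs if at least one input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)

def top_event(p_operator_error, p_sensor_fails, p_backup_fails):
    """Hypothetical tree: hazard = operator error OR (sensor fails AND backup fails)."""
    equipment = and_gate([p_sensor_fails, p_backup_fails])
    return or_gate([p_operator_error, equipment])
```

With an operator error probability of 0.01 and two redundant devices that each fail with probability 0.1, the top-event probability is about 0.02, half of which comes from the human operator.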

The  reliability  (including  availability  and  maintainability)  area  has  been  a focus 
of  activity  for  many  years.  Many  topological  methods  of  system  analysis  have  been 
developed  along  with  approximate  bounds  which  have  served  as  the  basic  algorithms  for 
many  of  the  computer  analysis  programs  produced  in  recent  years.  Also,  much  has 
been  done  in  applying  Markov  models  to  a wide  variety  of  problems  in  availability  com- 
putation. Recent  thrusts  involve  (1)  a queueing  theory  approach  to  availability  and  main- 
tainability which  allows  more  complex  repair  models;  (2)  modeling  of  the  inspection 
interval  problem  (applied  to  critical  systems),  so  as  to  optimize  the  benefits  in  safety 
and  minimize  the  cost;  and  (3)  the  study  of  computer-assisted  test  of  electronic  systems 
with  regard  to  design  constraints,  optimum  test  points,  and  overall  cost  minimization 
and  reliability  maximization. 

In  the  area  of  software  engineering,  a substantial  portion  of  this  effort  is  sup- 
ported by  the  Rome  Air  Development  Center.  Activities  in  this  area  are  focused  upon 
five aspects of software reliability: a) construction of probabilistic models for software
errors,  which  reflect  the  content  and  type  of  error,  removal  and  generation  rate,  pre- 
diction of  mean  time  between  failure  and  reliability;  b)  measures  of  computational  com- 
plexity based  upon  the  algorithm,  automata  theoretic  or  graph  theoretic  complexity, 
information  content  and  program  size;  c)  relationship  of  modular  and  structured  pro- 
gramming styles  to  reliability;  d)  modeling  techniques  for  program  verification  and 
testing  including  optimal  test  strategies  for  nonexhaustive  tests;  e)  models  for  the  com- 
parison of  programming  languages,  comparative  study  of  formal  definition  languages, 
and  techniques  for  compiler  writing. 

The  probabilistic  modeling  effort  dealt  with  the  construction  of  many-state 
Markov  models  for  the  determination  of  software  availability  and  macro  models  to 
predict  number  of  bugs  and  program  reliability.  Preliminary  models  were  developed 
to  predict  operational  software  reliability.  In  the  second  phase  of  this  work,  error- 
generation  terms  were  added  to  increase  the  degree  of  realism.  A new  micro  approach 
relates  errors  and  reliability  to  a functional  path  decomposition  of  the  program. 

Several  approaches  are  being  taken  to  provide  measures  of  complexity,  both 
for  the  problem  and  the  ensuing  software.  The  initial  efforts  treated  the  complexity 
of  a function  via  recursive  function  theory.  Present  efforts  are  directed  toward 
relating  program  complexity  to  established  ideas  in  the  fields  of  automata,  communica- 
tions and  information  theory.  Recent  results  provide  a measure  of  program  length 
based on Zipf's Law and operator/operand count.

In the area of structured and modular programming two efforts are being pursued.
One effort, presently in progress, applies modular programming techniques to
the  automatic-interactive  construction  of  programs.  In  addition,  the  statistical  design  of 
an experiment for the objective evaluation of structured vs. nonstructured programs is
in  the  planning  phase. 


In  the  program  test  area,  two  efforts  are  in  progress.  The  objective  of  the 
first  effort  is  to  develop  a method  which  computes  the  number  of  tests  necessary  to 
verify  a computer  program.  Verification  is  categorized  into  several  classes,  including 
"exhaustive"  at  one  extreme  and  "processing  of  each  instruction  at  least  once"  at  the 
other.  In  the  second  effort,  an  analytic  method  is  being  developed  for  the  selection  of 
data  for  automatic  program  testing. 


In  the  area  of  languages,  a new  effort  has  begun  to  construct  a very  high  level 
language  for  writing  program  specifications.  It  is  envisioned  that  this  language  will  lie 
between a high level programming language (FORTRAN, PL/I, etc.) and English prose,
but  will  be  concise  and  yet  avoid  ambiguity. 


D.  SYSTEMS,  CONTROL  AND  NETWORKS 
Program  Director:  D.  C.  Youla 


Research  activity  in  the  network  area  encompasses  both  the  classical  and 
modern  lumped-distributed  domains.  In  the  latter,  several  significant  breakthroughs 
have  been  made  which  are  expected  to  lead  to  an  exact  insertion-loss  synthesis  technique 
for  the  design  of  optimum  filters  incorporating  cascades  of  equi-delay  TEM  lines  and 
lumped,  lossless  two-ports.  In  particular,  it  is  expected  that  transformers  and  filters 
exhibiting equi-ripple performance in both pass and stop bands can be designed to
specifications. In principle, there appears to be no reason why the method cannot be extended
to  single-moded  waveguide  networks  employing  obstacles  to  produce  the  lumped  discon- 
tinuities. An  unexpected  and  important  outgrowth  of  the  above  study  has  been  the  devel- 
opment of  a new  diagnostic  digital  technique  for  probing  one-dimensional  nonuniform 
structures.  Hopefully,  the  modeling  of  dispersion  by  means  of  lumped  two-ports  will 
enlarge  the  scope  of  the  method  significantly. 

A major  topic  in  the  control  area  involves  the  application  of  classical  ideas  to 
the  design  of  feedback  controllers  for  linear  multivariable  systems.  All  efforts  on  this 
topic  have  been  completely  successful.  A totally  new  frequency-domain  technique  has 
been  developed  for  the  design  of  optimal  multivariable  controllers.  The  class  of  prob- 
lems that  fall  within  the  scope  of  this  technique  is  much  broader  than  that  encompassed 
by  the  linear,  quadratic  gaussian  approach  (LQG).  The  latter  revolves  around  the  idea 
of  Kalman  filtering  and  for  this  reason  is  unable  to  absorb  in  a natural  and  straightfor- 
ward manner  many  essential  practical  constraints.  The  group  is  presently  engaged  in 
work  which  should  result  in  an  effective  computer  implementation  of  the  optimal  con- 
troller. The  availability  of  such  an  algorithm  will  undoubtedly  suggest  related  simpler 
suboptimal  strategies  and  open  the  door  to  significant  industrial  applications. 

Work  on  the  control  of  stochastic  systems  has  been  focused  on  situations  in 
which  discrete  events,  such  as  equipment  failures  or  message  arrivals,  occur  at  random 
times.  Recursive  optimization  equations  have  been  derived  in  many  cases  via  dynamic 
programming. Novel techniques have been devised to solve those equations to get
optimal decision rules for diverting messages or vehicles to alternate routes in situations
with  simple  arrival  and  service  time  probability  distributions.  In  more  complex  cases, 
structural  properties  of  the  optimal  controllers  have  been  derived  and  specific  control 
algorithms  have  been  compared  via  simulation. 
In the area of reliability applications, both repairman assignment and optimal
inspection problems have been solved. The problems studied here have been of a higher
order of complexity than those generally formulated by system analysts. In one case,
both  feedback  control  for  dynamic  tracking  accuracy  and  repairman  assignment  policies 
were  simultaneously  optimized  for  repairable  systems.  In  another,  techniques  were 
developed  for  optimizing  inspection  schedules  when  the  corresponding  tests  degrade  the 
lifetimes  of  the  very  components  whose  status  is  being  checked. 

Identification  and  parameter  estimation  from  random  data  is  of  central  interest 
in  systems  applications.  In  control  applications,  one  must  have  better  knowledge  about 
the  plants  as  well  as  effective  models  of  certain  plant  elements  or  functions  for  which 
there  are  no  intrinsic  analytical  descriptions  available.  This  is  especially  true  when 
human elements are in the loop and we try to describe neuromuscular systems such as
arm  and  leg  control,  pupil  response,  etc.  We  also  require  identification  and  modeling 
techniques  when  the  system  itself  is  too  complex  so  that  its  fundamental  input-output 
properties  are  unknown.  This  is  especially  true  of  large-scale  economic  systems  where 
dynamic  models  are  obtained  in  some  optimal  fashion  from  available  data  in  order  to 
describe  the  evolution  of  the  system  as  a result  of  changing  policies.  Finally,  even  in 
the  case  for  which  dynamic  equations  describing  a system  are  known  or  accepted, 
parameters are often unknown and must be estimated. Identification and estimation
techniques  have  been  applied  to  all  of  the  problems  described  above.  But,  to  a great 
extent,  the  basic  feature  of  previous  applications  of  identification  and  parameter  estima- 
tion is  that  the  models  of  the  systems  studied  have  been  time  invariant,  and  their  param- 
eters have  been  estimated  from  statistically  stationary  data  (after  trends  have  been 
removed).  However,  there  exist  important  phenomena  for  which,  even  after  trends  have 
been  removed,  the  data  is  intrinsically  non-stationary,  characterized  usually  by  a period 
of  increasing  and  then  decreasing  intensities.  Such  characteristics  are  found  in  eco- 
nomic systems,  but  are  even  more  prevalent  in  geophysical  observations  such  as  mete- 
orological fluctuations  and  geological  phenomena  such  as  earthquake  excitations.  Moti- 
vated by  this  last  application  to  characterize  the  statistical  properties  of  earthquake 
excitations,  recent  studies  have  led  to  new  limit  theorems  for  estimating  significant 
parameters in models of non-stationary time series, and have led to a specific approach
for  modeling  earthquake  acceleration  data.  This  approach  has  been  implemented  on 
computers  and  is  being  applied  to  study  recent  earthquake  acceleration  data  obtained 
from  the  Western  United  States.  We  are  just  at  the  beginning  in  developing  techniques 
for  the  identification  of  non-stationary  systems  and  data.  Many  problems  remain  to  be 
solved. 


E.  DATA  PROCESSING 
Program  Director:  A.  Papoulis 


Current  studies  in  data  processing  include  the  following  aspects  of  picture  (or 
tabular)  processing  and  spectral  estimation:  statistical  enhancement  and  extraction 
techniques  for  pictorial  or  tabular  data;  reduction  of  diffraction  effects  in  the  imaging 
of  coherent  and  incoherent  objects;  image  distortion  resulting  from  atmospheric  turbu- 
lence; and  statistical  analysis  in  spectral  estimation. 


Statistical  enhancement  and  extraction  techniques  for  pictorial  or  tabular  data: 
This research concentrates on developing procedures that present the data in a useful
form, or in a way in which factors in the data that are "almost invisible" become plainly
visible. The effort was concentrated on factor analysis and masking techniques. The
masking operation was taken to mean the process in which a "window" or "mask" is swept
across the data electronically, mathematically, mechanically, or by some combination of
these. The motives for selection of a particular
mask  are  simplicity  and  effectiveness.  The  preliminary  results  indicate  that  both  ap- 
proaches to  the  enhancement  and  extraction  problems  of  tabular  data  are  promising,  and 
the  research  effort  utilizing  these  and  related  techniques  will  be  continued. 
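
The masking operation described above can be illustrated with a minimal sketch. The report does not specify the masks that were used, so the table, the 3 x 3 centre-minus-neighbours mask, and the function name `sweep_mask` below are all hypothetical choices made only to show how a swept window can make a faint feature stand out against a strong trend.

```python
def sweep_mask(table, mask):
    """Slide `mask` over `table` and return the weighted sum at each position."""
    rows, cols = len(table), len(table[0])
    m_rows, m_cols = len(mask), len(mask[0])
    out = []
    for i in range(rows - m_rows + 1):
        out_row = []
        for j in range(cols - m_cols + 1):
            acc = 0.0
            for a in range(m_rows):
                for b in range(m_cols):
                    acc += mask[a][b] * table[i + a][j + b]
            out_row.append(acc)
        out.append(out_row)
    return out


# Hypothetical data: a strong linear trend hiding a faint bump at row 2, column 3.
table = [[10.0 * i + 5.0 * j for j in range(6)] for i in range(5)]
table[2][3] += 0.5

# Assumed mask: centre minus the mean of its eight neighbours. The weights sum
# to zero and are symmetric, so a constant or linear trend cancels exactly.
mask = [[-0.125, -0.125, -0.125],
        [-0.125, 1.0, -0.125],
        [-0.125, -0.125, -0.125]]

enhanced = sweep_mask(table, mask)
```

In the output, the entry whose window is centred on the bump carries the bump amplitude, while entries whose windows see only the trend are zero. A different mask would emphasize different factors in the data.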

Reduction of diffraction effects in the imaging of coherent and incoherent objects:
The known methods of deconvolution in one and two dimensions for analog and
digital data are compared for accuracy and computational economy. A new technique
is  developed  for  the  complete  recovery  of  objects  of  finite  size.  The  technique  is  based 
on  an  iteration  scheme  involving  only  the  FFT.  It  is  shown  that,  in  the  absence  of 
noise,  the  iteration  converges  to  the  unknown  object.  The  effects  of  noise  and  the  ali- 
asing errors  are  determined  and  it  is  shown  that  they  can  be  controlled  by  early  termina- 
tion of  the  iteration. 
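
The kind of FFT-only iteration described above can be sketched as follows. The report gives no algorithmic details, so this is an assumed scheme of the alternating-constraint type: the known low-frequency spectrum is restored in the transform domain, and the known finite extent of the object is enforced in the object domain. The signal length, support, passband, and all function names are hypothetical.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (length must be a power of two)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def ifft(spectrum):
    n = len(spectrum)
    return [v / n for v in fft(spectrum, inverse=True)]


def recover_finite_object(known_spectrum, passband, support, n_iter):
    """Alternate between the spectral constraint and the finite-support constraint."""
    n = len(known_spectrum)
    estimate = [0j] * n
    history = []
    for _ in range(n_iter):
        spectrum = fft(estimate)
        for k in passband:                      # restore the measured spectrum
            spectrum[k] = known_spectrum[k]
        estimate = ifft(spectrum)
        estimate = [v if i in support else 0j   # object is known to be finite
                    for i, v in enumerate(estimate)]
        history.append(estimate)
    return history


# Hypothetical noise-free test case.
n = 64
support = set(range(28, 36))
true_object = [float(i - 27) if i in support else 0.0 for i in range(n)]
full_spectrum = fft(true_object)
passband = [k for k in range(n) if min(k, n - k) <= 10]
known_spectrum = [full_spectrum[k] if k in passband else 0j for k in range(n)]

history = recover_finite_object(known_spectrum, passband, support, n_iter=50)
errors = [sum(abs(g - t) ** 2 for g, t in zip(est, true_object)) ** 0.5
          for est in history]
```

Each step is an orthogonal projection onto a set containing the true object, so the error in `errors` cannot grow from one iteration to the next in the noise-free case. With noisy data the number of iterations becomes a regularization parameter, in line with the early termination mentioned above.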

Image  distortion  resulting  from  atmospheric  turbulence:  In  the  recent  litera- 
ture, a method  has  been  proposed  for  a dynamic  correction  of  this  distortion.  The 
underlying  filtering  (Poisson  noise)  and  prediction  (control  delay)  problem  leads  to  a 
two-dimensional  Wiener-Hopf  equation  whose  solution  is  under  investigation  under 
various  assumptions  about  the  spectrum  of  the  turbulence.  The  reverse  problem  of 
determining the properties of the medium in terms of the image of a moving satellite has
also  been  considered. 

Statistical  analysis  in  spectral  estimation:  The  current  emphasis  is  on  the 
statistical analysis of the method of maximum entropy and on its comparison with other
methods  for  various  special  forms  of  the  unknown  spectra.  An  adaptive  perturbation 
scheme  is  under  consideration  for  updating  the  estimated  spectra  as  new  information  is 
received.  The  convergence  of  the  scheme  to  the  smoothed  spectrum  is  under  examina- 
tion. 
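
As a point of reference for the maximum entropy method named above, the following minimal sketch assumes that a few autocorrelation lags are known exactly. In that case the maximum entropy spectrum is the spectrum of an autoregressive model fitted by the Levinson-Durbin recursion. The function names and the test autocorrelation are hypothetical, and the adaptive perturbation scheme under study is not represented here.

```python
import cmath


def levinson_durbin(autocorr, order):
    """Return prediction-error filter coefficients [1, a1, ..., ap] and error power."""
    coeffs = [1.0]
    error = autocorr[0]
    for m in range(1, order + 1):
        acc = sum(coeffs[i] * autocorr[m - i] for i in range(m))
        reflection = -acc / error
        padded = coeffs + [0.0]
        coeffs = [padded[i] + reflection * padded[m - i] for i in range(m + 1)]
        error *= 1.0 - reflection * reflection
    return coeffs, error


def max_entropy_spectrum(coeffs, error, freq):
    """Spectral density at normalized frequency `freq` (cycles per sample)."""
    denom = sum(a * cmath.exp(-2j * cmath.pi * freq * k)
                for k, a in enumerate(coeffs))
    return error / abs(denom) ** 2


# Hypothetical data: autocorrelation lags 0.5**k of a first-order process.
autocorr = [1.0, 0.5, 0.25]
coeffs, error = levinson_durbin(autocorr, order=2)
spectrum_at_zero = max_entropy_spectrum(coeffs, error, 0.0)
```

For this test sequence the second-order coefficient vanishes, as expected for a first-order process, and the estimate at zero frequency equals 0.75 / (1 - 0.5)^2 = 3.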
EXCITATION OF LARGE CONCAVE SURFACES
L.B.  Felsen,  A.  Green  and  A.  Hessel 

While  the  excitation  of,  and  propagation  along,  convex  surfaces  has  been  thorough- 
ly explored  in  the  technical  literature,  much  less  attention  has  been  given  to  the  cor- 
responding problems  for  concave  boundaries.  Nevertheless,  concave  configurations 
are  of  importance  in  a variety  of  applications  including  ground  wave  propagation  over 
terrain  with  smooth  hills  and  depressions,  scattering  from  reflectors  and  similar  open 
structures,  scattering  from  configurations  backed  by  large  cylindrical  or  spherical 
cavities,  and  mutual  coupling  in  dome-shaped  conformal  arrays.  Of  special  interest, 
and  most  difficult  to  analyze,  are  the  fields  observed  on  or  near  the  surface  when  the 
source  is  also  situated  on  or  near  the  surface. 

The  most  fundamental  difference  between  the  excitation  of  concave  and  convex 
surfaces of large (compared to the wavelength) radius of curvature is the absence in
the former of a geometrical shadow region, from which the source is invisible. This
circumstance gives rise not only to a more intricate geometric-optical field comprising
multiply  reflected  rays  (Fig.  1)  but  also,  in  an  alternative  guided  wave  description,  to 
the  presence  of  whispering  gallery  modes  which  cling  to  the  surface  (Fig.  2)  and,  in 
the  absence  of  dissipation,  experience  no  attenuation.  Their  counterparts  on  a convex 
surface,  the  creeping  waves,  lose  energy  by  tangential  shedding  along  the  propagation 
path.  The  problem  is  complicated  further  by  the  fact  that  ray  optics  is  incapable  of 
providing  the  field  solution  since  the  caustics  of  multiply  reflected  rays  pile  up  near  the 
boundary and thus invalidate the geometric-optical field evaluation there. A field
representation in terms of whispering gallery modes only (with inclusion of a continuous
mode spectrum for some surfaces), while valid, is inconvenient for calculation for
large separation of source and observation points on a large-radius surface since the
number  of  modes  required  can  be  substantial.  These  problems  do  not  arise  on  a con- 
vex surface  where  the  distant  field  in  the  shadow  region  is  represented  compactly  by 
the  dominant  creeping  wave. 

To  gain  a better  physical  as  well  as  quantitative  understanding  of  these  aspects 
of  wave  propagation  on  a concave  surface,  intensive  studies  have  been  carried  out  on 
the  simplest  prototype  configuration,  the  interior  of  a perfectly  reflecting  circular 
cylinder excited by an axial line source. A peculiarity of the cylindrical geometry
is the presence, in addition to the whispering gallery modes, of a continuous guided
mode  spectrum  which  arises  because  of  spurious  reflections  from  the  radial  coordinate 
origin.  Elimination  of  these  spurious  contributions  leads  to  an  asymptotic  field  repre- 
sentation in  terms  of  an  integral  which  can  be  manipulated  so  as  to  exhibit  ray-optical 
contributions,  whispering  gallery  mode  contributions,  a mixture  of  these,  or  a 
formulation containing a reduced canonical integral analogous to the Fock integral for
convex surfaces. The most effective choice depends on the parameters of the problem.
The results so obtained can be generalized to apply to arbitrary concave surface shapes
provided that the radius of curvature changes slowly over a wavelength interval. The
validity of these variable-curvature solutions has been verified by comparison with
exact calculations performed for a parabolic contour (Ref. 4).

We  have  extended  the  analysis  to  accommodate  a surface  impedance  boundary 
condition.  This  is  required  when  dealing  with  ground  wave  propagation  and  also  with 
the  concave  array  mutual  coupling  problem.  Moreover,  the  results  in  the  literature 
are  inadequate  for  tracking  the  field  to  the  vicinity  of  the  source  or,  when  source  and 
observation  points  are  fixed,  to  the  limit  of  very  large  radius  of  curvature.  For  both 
cases,  we  have  obtained  a formulation  in  terms  of  the  field  for  a flat  boundary  plus 
curvature  dependent  correction  terms. 

For  the  two-dimensional  perfectly  conducting  circular  cylinder,  the  relevant 
propagation effects between source and observation points located on the boundary are
contained  in  the  integral 


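$$ G_0(\boldsymbol{\rho}, \boldsymbol{\rho}') = \frac{1}{i(\pi ka)^2} \int_C \frac{e^{i\nu|\phi - \phi'|}}{H_\nu^{(1)\prime}(ka)\, J'_\nu(ka)}\, d\nu , \qquad \boldsymbol{\rho} = (\rho, \phi) \qquad (1) $$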


where k is the free-space wavenumber, the prime on the cylinder functions denotes the
derivative with respect to the argument, and a time factor $\exp(-i\omega t)$ is implied. It has
been assumed in Eq. (1) that the normal derivative of the Green's function vanishes at
$\rho = a$; this makes $G_0$ proportional to the axial component of magnetic field. The contour
C and the singularities of the integrand in the complex $\nu$-plane are shown in Figure 3.


Fig. 3. Integration path and singularities in
complex $\nu$-plane.


Contributions from the pole singularities arising from

$$ J'_{\nu_m}(ka) = 0 , \qquad m = 1, 2, \ldots \qquad (2) $$

are found to describe whispering gallery modes. While Eq. (2) has an infinite number
of real solutions as indicated in Fig. 3, only those with Re $\nu_m > 0$ represent spectral
contributions in the angular transmission representation, which also includes a continu-
ous spectrum.

The  alternative  representations  noted  above  are  obtained  from  Eq.  (1)  by  various 
analytical  procedures  including  contour  deformation,  residue  evaluation,  saddle  point 
techniques, traveling wave expansion of $1/J'_\nu(ka)$, partial summation of the resulting
series,  and  the  like.  For  most  of  these  manipulations,  it  suffices  to  consider  the 
simplified  form  of  Eq.  (1), 
where $s = a|\phi - \phi'|$ denotes the arc length between source and observation points, and
the contour of integration in the t-plane is inferred from C by the mapping $\nu = ka + (ka/2)^{1/3} t$. In the near field
or infinite plane limits (i.e., for small y), Eq. (3) may be reduced to
where the factor outside the braces represents the Green's function for an infinite plane
surface.  When  the  curvature  between  fixed  source  and  observation  points  changes  from 
concave  to  convex  (i.e.  , "a"  is  transformed  continuously  into  (-a)),  one  may  show  that 
$G_0$ in Eq. (3) becomes
the known Green's function for a convex boundary. Here, $w_1 = \mathrm{Ai} - i\,\mathrm{Bi}$.

Generalization to variable radius of curvature a(s) involves replacement of Y by

$$ \frac{k}{2} \int_0^s \frac{dt}{[f(t)]^2} , \qquad f(s) = [ka(s)/2]^{1/3} , $$

and other related changes. To accommodate

the impedance boundary condition, one generalizes the integral in Eq. (1) by employing
$(H_\nu^{(1)\prime} - iZ H_\nu^{(1)})$ and $(J'_\nu - iZ J_\nu)$ instead of $H_\nu^{(1)\prime}$ and $J'_\nu$, respectively, with Z denoting
the normalized surface impedance.

U.S.  Army  Research  Office 
RADIATION BY SOURCES IN A UNIDIRECTIONALLY CONDUCTING PLANE
A. Hessel, S. Siddiqi and J. Shmoys

A spiral antenna is a potentially useful phased array element. It is likely to have rela-
tively high bandwidth and scan volume and can be made inexpensively. In order to study the
performance of such an element, the boundary value problem for a sheath model of the
multi-arm spiral was formulated as a Galerkin-type procedure. In order to gain an under-
standing of the convergence of the Galerkin procedure in this type of boundary value prob-
lem, we formulated a closely related two-dimensional problem in which the number of
space harmonic modes is N (rather than $N^2$ for the three-dimensional spiral structure). This
reduces the computer time and permits the inclusion of a large number of space harmonics.

In a spiral element the circular source region is surrounded by a ring-shaped
sheath region in which conduction occurs along the wires only (cf. Figures 1(a), (c)).
The surface current in the source region is prescribed. The corresponding two-dimen-
sional model has a strip source region and a strip sheath region on each side of the
source, as shown in Figures 1(b), (d). In both cases the boundary value problem must
be formulated so that, for a given source current distribution, the component of the elec-
tric field along the conduction direction vanishes in the x-y plane. In both cases it
is simplest to follow the procedure outlined below:

(1)  Expand  the  unknown  current  distribution  in  the  conduction  (sheath)  region  in 
a suitable  set  of  functions  and  truncate  the  expansion.  This  introduces  a 
finite  number  of  coefficients  to  be  determined. 

(2) In terms of the unknown coefficients defined in (1), the magnetic field in the
plane z = 0+ is now completely defined; hence the electric field can be calculated.

(3)  The  component  of  the  electric  field  in  the  conduction  direction  now  must  be 
set  to  zero.  The  essence  of  the  Galerkin  procedure  is  to  project  the  function 
describing  this  field  component  onto  the  finite  dimensional  function  space 
defined  in  (1)  for  the  representation  of  the  current  distribution.  This  will 
yield as many equations as there are unknown coefficients.

(4)  The  set  of  equations  is  then  solved  and  the  physical  quantities  of  interest  cal- 
culated. 

An  alternative  procedure  can  be  used  instead  of  (3)  and  (4)  above:  if  the  rms  value  of 
the  tangential  component  of  the  electric  field  is  calculated  and  then  minimized  with 
respect  to  all  of  the  parameters  used  in  the  representation  of  the  current  distribution, 
the  procedure  is  guaranteed  to  converge. 
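
The four steps above can be sketched on a toy problem. The example below is not the sheath problem: it assumes the one-dimensional equation -u'' = f on (0, 1) with u(0) = u(1) = 0, and a hypothetical polynomial basis x^k (1 - x). It only illustrates the pattern of expanding the unknown in a truncated set, projecting the residual onto the same set, and solving the resulting linear system.

```python
def simpson(func, a, b, panels=200):
    """Composite Simpson rule with an even number of panels."""
    h = (b - a) / panels
    total = func(a) + func(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def basis(k, x):
    """Hypothetical expansion function x**k * (1 - x); vanishes at both ends."""
    return x ** k * (1.0 - x)


def basis_second_derivative(k, x):
    first = k * (k - 1) * x ** (k - 2) if k >= 2 else 0.0
    return first - (k + 1) * k * x ** (k - 1)


def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def galerkin(source, num_terms):
    """Project the residual of -u'' = source onto the expansion functions."""
    indices = range(1, num_terms + 1)
    matrix = [[simpson(lambda x, m=m, k=k:
                       -basis_second_derivative(k, x) * basis(m, x), 0.0, 1.0)
               for k in indices] for m in indices]
    rhs = [simpson(lambda x, m=m: source(x) * basis(m, x), 0.0, 1.0)
           for m in indices]
    return solve_linear_system(matrix, rhs)


# For a unit source the exact solution is x(1 - x)/2, i.e. coefficients (0.5, 0).
coefficients = galerkin(lambda x: 1.0, num_terms=2)
```

Because the exact solution of this toy problem lies in the span of the chosen functions, the projection recovers it to within quadrature error. In the sheath problem the matrix elements are sums over space harmonics instead of simple integrals.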

In this investigation we used three different representations of the current density
distribution with the Galerkin method and a "quasi-static" representation in the varia-
tional method. For simplicity only broadside phasing will be considered, although the
extension to other radiation directions is straightforward.

Fig. 1. Spiral and strip sheath elements.

A.  Formulation  of  the  Problem 

Let  us  consider  a unit  cell  as  shown  in  Figs.  1 (b),(d),  with  the  source  surface 
current  being  given: 

$$ \mathbf{J}_s = 2 J_0\, \mathbf{y}_0 , \qquad |y| < a/2 \qquad (1) $$

By symmetry, the transverse magnetic field in the z = 0+ plane is

$$ \mathbf{H}_t = -\tfrac{1}{2}\, \mathbf{z}_0 \times \mathbf{J}_s = J_0\, \mathbf{x}_0 , \qquad |y| < a/2 \qquad (2) $$

The thin wires in the conducting sheath region are characterized by perfect conductivity
in the wire direction, $\mathbf{t}_0 = \cos\alpha\, \mathbf{x}_0 + \sin\alpha\, \mathbf{y}_0$, and zero conductivity in the perpendicular
direction, so that the transverse magnetic field in the sheath region is

$$ \mathbf{H}_t = -\mathbf{z}_0 \times \mathbf{t}_0\, H(y) , \qquad a/2 < |y| < b/2 \qquad (3) $$

where 2H is the complex current amplitude of the induced current. Finally, in the
outer region, b/2 < |y| < d/2, the transverse magnetic field vanishes. From the
continuity of current flow at the edges of the conduction region, we know that

$$ \sin\alpha\, H(\pm a/2) = J_0 ; \qquad H(\pm b/2) = 0 \qquad (4) $$

In terms of the as yet unknown current distribution H(y) the transverse magnetic field
is specified in the entire unit cell. Hence the current amplitudes of the space harmonic
expansion,

$$ z > 0 , \qquad \mathbf{H}_t(y,z) = \sum_n \left[ I'_n\, \mathbf{h}'_n(y) + I''_n\, \mathbf{h}''_n(y) \right] \exp(-j k_n z) \qquad (5) $$

can be expressed in terms of the function H(y) and the given $J_0$. The set of orthonormal
space harmonic mode functions is defined in the usual manner, with ( ' ) denoting TM
modes and ( '' ) TE modes.

Let the sheath current density now be represented by a Fourier series

$$ H(y) = J_0 (\sin\alpha)^{-1} \left[ f(y) + \sum_k F_k\, \psi_k(y) \right] \qquad (6) $$

The three representations used in the first part of the study are as follows:

(1) The $\psi_k$ are sine functions, vanishing at a/2 and b/2, and f(y) = 0. Since we know
that $H(a/2) = J_0/\sin\alpha$, we must expect the Gibbs phenomenon and its effect on
convergence.

(2) f(y) is a linear function satisfying the continuity conditions at a/2 and b/2, and the
$\psi_k$ vanish at a/2 and b/2.

(3) f(y) = 0, and the $\psi_k$ vanish at b/2 but not at a/2: $\psi'_k = 0$ at a/2.
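
The Gibbs phenomenon expected for representation (1) can be illustrated with a minimal sketch. As a stand-in for the sheath current, it assumes the function 1 - y on (0, 1), which is nonzero at y = 0 while every sine term vanishes there. The interval, the test function, and the names below are hypothetical and serve only to show the overshoot of a truncated sine series.

```python
import math


def sine_series_partial_sum(y, num_terms):
    """Partial sum of the sine series of 1 - y on (0, 1); coefficients are 2/(k*pi)."""
    return sum(2.0 / (k * math.pi) * math.sin(k * math.pi * y)
               for k in range(1, num_terms + 1))


num_terms = 200
# Sample the neighbourhood of the endpoint where the series is forced to zero.
grid = [i * 5.0e-5 for i in range(1, 2001)]        # 0 < y <= 0.1
peak = max(sine_series_partial_sum(y, num_terms) for y in grid)
overshoot = peak - 1.0                              # exact function never exceeds 1
```

The overshoot does not shrink as more terms are added; it only moves closer to the endpoint. This is the motivation for representations (2) and (3), which remove the mismatch at a/2 either with a transition function or with basis functions that need not vanish there.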

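Since the space harmonic expansion is orthonormal, $I'_n$ and $I''_n$ can be readily calculated
in terms of the $F_k$'s. The electric field can then be calculated as follows:

$$ z > 0 , \qquad \mathbf{E}_t(y,z) = \sum_n \left( Z'_n I'_n\, \mathbf{e}'_n(y) + Z''_n I''_n\, \mathbf{e}''_n(y) \right) \exp(-j k_n z) . \qquad (7) $$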

The remaining boundary condition, that of vanishing of the electric field along the wires,

$$ \mathbf{E}_t(y,0) \cdot \mathbf{t}_0 = 0 , \qquad a/2 < y < b/2 , \qquad (8) $$

can be imposed by setting the scalar products of the left-hand side of Eq. (8) with the
set of functions $\psi_k$ equal to zero:

$$ \int_{a/2}^{b/2} \mathbf{E}_t(y,0) \cdot \mathbf{t}_0\, \psi_k(y)\, dy = 0 , \qquad k = 1, \ldots, K \qquad (9) $$

The use of the same truncated set of $\psi_k$ in Eq. (6) and in enforcing Eq. (8) is the essence
of the Galerkin procedure. The system of linear equations obtained is of order K. The
coefficients of this system involve summations over the space harmonics, which are
also truncated. Thus we have

(10) 


y A . F,  + B 

— I mk  k m 
k 


m = 1,2, 


,K 


where 


mk 


e *t>Z<e  > 

m ' — n — o n — n —o'  k 


(11) 


B 


(12) 


the summation extends over all, TE and TM, space harmonics. The scalar product
notation is defined by

$$ \langle \Phi | \psi \rangle = \int_{a/2}^{b/2} \Phi^*(y)\, \psi(y)\, dy $$

and $I_{n\,\mathrm{FEED}}$ denotes a similar scalar product of the n-th space harmonic with the im-
pressed magnetic field in the feed (center) region. Once the coefficients $F_k$ are deter-
mined, all the physically relevant quantities can be readily calculated.

In the variational method (Ref. 2) we made use of the function

$$ f(y) = U\left[ (y - a/2)^{1/2} - \left(\frac{b-a}{2}\right)^{1/2} - (y - b/2)\left(\frac{2}{b-a}\right)^{1/2} \right] + V\left[ (b/2 - y)^{1/2} + (y - b/2)\left(\frac{2}{b-a}\right)^{1/2} \right] - \frac{2y - b}{b - a} \qquad (13) $$

in the representation of the magnetic field, Eq. (6), with $\psi_k$ being sine functions. This
expression contains 2 + K complex constants or 4 + 2K real parameters. The mean
square tangential field is then given by

$$ \int_{a/2}^{b/2} \left| \mathbf{E}_t \cdot \mathbf{t}_0 \right|^2 dy = Q(U, V, F_k) \qquad (14) $$

a quadratic form in the parameters used in Eq. (13) and real and imaginary parts of the
Fourier coefficients.

The coefficients of this quadratic form are sums over space harmonics. They
are obtained by putting Eq. (13) into Eq. (6), then calculating $\mathbf{E}_t$ using Eqs. (5) and (7),
finally putting the result in Eq. (14) and collecting all second-order terms in the parameters.

Since the tangential electric field in the sheath region should be zero, we get the best
approximation by minimizing the quadratic form. The convergence of this procedure
is self-evident. The choice of the function f(y) in Eq. (13) was dictated by the fact that
it satisfies the continuity conditions and has the correct quasi-static square root behavior
at the ends of the interval. This procedure was carried out for the case K = 1 only.

B. Discussion of Results

One way in which we can examine the convergence of the method is by comparing
the magnetic field distribution calculated using different numbers of Fourier series
terms (K) and different numbers of space harmonics (from -N to N). In order to avoid
relative convergence difficulties the ratio N : K was kept at 2, which provides approxi-
mately the same spatial resolution in both functions. The approximate magnetic field
distribution can be calculated in two ways: (1) directly from the Fourier series in the
conduction region (with $J_0$ in the source region and 0 in the outer region) and (2) from
the space harmonic series in the entire unit cell. One indication of accuracy is the
agreement of the latter results with the former, particularly in the source and outer
regions where the field is prescribed. Such a comparison, for N = 20, K = 10, using the
sine series, representation (1), is shown in Figure 2. The conduction direction in this and
succeeding cases (except for the last one) is at right angles to the elements, $\alpha = 90°$, so
that it makes no difference whether the sheath region is unidirectionally or
omnidirectionally conducting. We see that the agreement between the two curves is quite
satisfactory. The same comparison for K = 50, N = 100 (Fig. 3) is even better. Figure 4
shows a comparison of the Fourier series sums for representations (2), sine series with
linear transition term, and (3), cosine series. Both are a clear improvement over the
sine series results.


Fig. 2. Normalized magnetic field distribution (real part).
N = 20, K = 10, representation (1).

The results of the variational approach without any Fourier terms, and with one
Fourier term added to the quasi-static transition function are compared with those for
representation (2) (sine series with linear transition) in Figure 5. We see the advantage
of the variational approach if we recall that in these cases a 4x4 or 6x6 matrix is in-
verted while in the case of representation (2) a 100x100 matrix is involved.

Another  way  to  compare  different  representations  is  by  the  residual  electric  field 
parallel  to  the  conduction  direction.  An  example  of  this  is  shown  in  Figure  6. 

Finally,  the  effect  of  changing  the  conduction  direction  was  studied,  using  the 
variational procedure with a single Fourier term. Figure 7 shows the effect of tilt angle
$\alpha$ on induced electric field in the source region. It was felt that the peak shown at 20°
is related to simple resonance, the length of the conduction path being a half wave-
length. This conjecture seems confirmed by Fig. 8, in which the effect of changing the
tilt  angle  on  the  amplitude  of  the  first  Fourier  coefficient  is  shown. 

Ballistic  Missile  Defense  Systems 
TRANSIENT DIFFRACTION BY A SEMI-INFINITE CONE
K.K.  Chan  and  L.B.  Felsen 

The time-dependent response due to a point source in the presence of a semi-infinite
acoustically soft or hard cone can be found through Laplace inversion of the time-
harmonic Green's functions. The solution obtained in Ref. 1, while generally valid, is
given as an infinite series which is useful for computation only in those parameter
ranges where it converges rapidly, i.e., at long observation times. In the following, a
different approach is used: we employ a "quasi-optic" integral representation of the time-
harmonic Green's function, which leads to a new form of the time-dependent solution.
The result, which contains an integral, can be reduced further if we restrict ourselves
to  the  evaluation  of  the  scattered  field  by  a small-angle  cone,  with  the  source  located 
on  the  cone  axis.  As  will  be  shown,  solutions  for  all  observation  times  are  obtained  in 
remarkably  simple  closed  forms  involving  only  elementary  functions  provided  that  the 
observation  point  lies  in  a region  that  excludes  the  rays  reflected  from  the  cone  surface 
according  to  the  laws  of  geometrical  optics.  This  region  of  applicability  accommodates 
important  applications,  in  particular,  the  on-axis  back-scattered  fields.  To  confirm 
the  validity  of  the  closed  form  results,  the  early-time  and  long-time  behavior  of  the 
transient  field  are  evaluated  as  special  cases  and  compared,  respectively,  with  solutions 
obtained  independently  by  different  approaches.  Both  the  scalar  Dirichlet  and  Neumann 
problems,  and  the  vector  dipole  problems  (leading  to  the  dyadic  Green's  functions),  can 
be  treated  in  this  manner. 

A.  Soft  Cone  (Dirichlet  Condition) 

1.  Formulation  and  Solution 

The time-dependent Green's function $G(\mathbf{r}, \mathbf{r}'; t, t')$ can be obtained via Laplace
inversion from the time-harmonic Green's function $G(\mathbf{r}, \mathbf{r}')$ (with time dependence
$e^{-i\omega t}$ suppressed throughout), which satisfies the three-dimensional wave equation

$$ (\nabla^2 + k^2)\, G(\mathbf{r}, \mathbf{r}') = -\delta(\mathbf{r} - \mathbf{r}') \qquad (1) $$

and boundary conditions

(a) $G(\mathbf{r}, \mathbf{r}') = 0$ at $\theta = \theta_0$ (1a)

(b) radiation condition at $r \to \infty$ (1b)

(c) tip condition ($G(\mathbf{r}, \mathbf{r}')$ finite) at $r = 0$ (1c)

The geometry of the configuration and the assumed spherical coordinate system are
shown in Figure 1.

Fig. 1. Physical configuration.

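The solution for $G(\mathbf{r}, \mathbf{r}')$ can be written as the sum of a free-space field $G_0(\mathbf{r}, \mathbf{r}')$
and a scattered field $G_s(\mathbf{r}, \mathbf{r}')$, due to the presence of the cone, as follows:

$$ G(\mathbf{r}, \mathbf{r}') = G_0(\mathbf{r}, \mathbf{r}') + G_s(\mathbf{r}, \mathbf{r}') \qquad (2) $$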


where 


4ir  l£-£^  I 


= ^ — 7 Z E_  cos  m((t) -<()')  • 

Sirkrr  m=0 

. f ■‘'''*‘"dv(2v«lh<‘>(k,)  h'«(kr')  . 

-1/2-ioo  ' V ' ' V ' r(v-m  + l) 

Pj*"(cos  0)  Pj’^(cos  e')  Pj”’(-COS  0q) 
sin  [( V - m)ir]  P^”^(cos  0q) 


(3a) 


(3b) 


with 

£r,  - T > c =1,  m^O. 

0 2m  ' 

Here,  Pj^(x)  is  the  Legendre  function  of  order  v,  degree  (-m)  and  argument  x,  while 


Idi 
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z (x)  is  a spherical  Bessel  function 


/ \ ’ TT  X ry 

= '/  — ^v+1/2 


(x)  , 


= uUK 


with  Z (x)  denoting  a solution  of  the  conventional  Bessel  equation  (in  Eq.  (3b),  z^  = 

To get the result in Eq. (3b), the condition $(\theta + \theta') < (2\theta_0 - \pi)$ has to be imposed (Refer-
ence 3). This restriction on $(\theta + \theta')$ excludes from the range of applicability that domain
which contains the rays reflected from the sides of the cone according to the laws of
geometrical optics (see Figure 2).


Fig. 2. Ray-optical domains (incident ray, reflected ray, diffracted ray, diffracted
wave front, and the domain of reflected rays).

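To convert the representation in Eq. (3b) into a form that permits direct Laplace
inversion, we first employ the change of variable

$$ \nu = -\tfrac{1}{2} + ix \qquad (4) $$

(The Laplace inversion of $G_0$ in Eq. (3a) yields

$$ G_0(\mathbf{r}, \mathbf{r}'; t, t') = \left[ 4\pi |\mathbf{r} - \mathbf{r}'| \right]^{-1} \delta\!\left( t - t' - \frac{|\mathbf{r} - \mathbf{r}'|}{c} \right) . ) $$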

4 

Introducing  the  Hankel  function  product  formula 

nf^^kr)  Hj^^(kr')  = ^ J dw  + r'^  + 2rr'  cos  w) 


-X(w-'IT  ) 
and letting y = iw, one transforms Eq. (3b) into

$$ G_s(\mathbf{r}, \mathbf{r}') = \int_{-\infty}^{\infty} dy\, H_0^{(1)}\!\left( k \sqrt{r^2 + r'^2 + 2rr' \cosh y} \right) A(y) \qquad (6) $$

where A(y) is given by
. 00 

A(y)  = y E cos  m(4> -<(>') 

Sir./rr'  rn=0 


/dx 


ixy  XTT 
xe  'e 


r(ix+m+i)  Jcos  9)P~7^^.^(cos  e^)P'7^^.^(-cos  Oq) 

sin[(-l/2+ix-m)Tr]p-*72+ix<^°®  ®o’ 


r(ix-m+ j) 


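The interchange of orders of integration is justified by the absolute convergence of the
double integral. Now let k = is/c, where s is positive and c is the speed of light in the
medium surrounding the cone. Recalling that

$$ H_\nu^{(1)}(iz) = \frac{2}{\pi i}\, e^{-i\nu\pi/2}\, K_\nu(z) \qquad (7) $$

where $K_\nu(z)$ is the modified Bessel function, and furthermore

$$ K_0(x) = \int_x^\infty d\xi\, \frac{e^{-\xi}}{\sqrt{\xi^2 - x^2}} , \qquad x > 0 \qquad (8) $$

we obtain

$$ G_s(\mathbf{r}, \mathbf{r}') = \int_0^\infty dy\, B(y) \int_{f/c}^\infty d\xi\, \frac{e^{-s\xi}}{\sqrt{\xi^2 - f^2/c^2}} \qquad (9) $$

where

$$ B(y) = \frac{2}{\pi i} \left[ A(y) + A(-y) \right] \qquad (9a) $$

$$ f \equiv f(y) = \sqrt{r^2 + r'^2 + 2rr' \cosh y} \qquad (9b) $$

Interchanging the orders of integration, one finds after simple manipulations,

$$ G_s(\mathbf{r}, \mathbf{r}') = \int_0^\infty d\xi\, e^{-s\xi}\, Q(\xi) \qquad (10) $$

where

$$ Q(\xi) = 0 , \qquad c\xi < r + r' \qquad (10a) $$

$$ Q(\xi) = \int_0^{\alpha(\xi)} dy\, \frac{B(y)}{\sqrt{\xi^2 - f^2/c^2}} , \qquad c\xi > r + r' \qquad (10b) $$

with

$$ \alpha(\xi) = \cosh^{-1}\!\left[ \frac{c^2 \xi^2 - r^2 - r'^2}{2rr'} \right] \qquad (10c) $$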


In view of Eq. (10), which is itself in the form of a Laplace transform, the inversion
of $G_s$ into the time domain is trivial. It follows that

$$ G_s(\mathbf{r}, \mathbf{r}'; t, t') = \int_0^{\alpha} dy\, \frac{B(y)}{\sqrt{\tau^2 - f^2/c^2}}\; U\!\left( \tau - \frac{r + r'}{c} \right) \qquad (11) $$

where U(x) is the Heaviside unit function, $\alpha \equiv \alpha(\tau)$ and $\tau = t - t'$. This solution can be
further simplified. Substituting Eqs. (9a) and (6a) into Eq. (11) and performing an
interchange of the order of integration, one may carry out the integration in y by utiliz-
ing the integral representation of the Legendre function

$$ P_{-1/2+ix}(\cosh\alpha) = \frac{\sqrt{2}}{\pi} \int_0^{\alpha} dy\, \frac{\cos xy}{\sqrt{\cosh\alpha - \cosh y}} \qquad (12) $$

Thus,  from  Eq.  (12)  and  properties  of  the  gamma  function,  we  obtain 

00 

Gg(£*  £';t.  i')  = ^-l/2+ix^‘^°®^ 


(12) 


1)  U(T  - £±^) 


(13) 


where 


cosh  a = 


2 2 2 z2 

c T - r - r 


2rr' 


oc 

» , , c r>  / j.  i'x  X tann  i 

F(x)=-—  >,  ’THiTHf 

2rr  -n=0 


X tanh  irx 
X 


* ss  Ti- : — . r »-*  j ' . — TT~\  t 


(14a) 


(14b) 


r(ix-m+l/2)  r(-ix-m+l/2)  p-m 


:i/2+ix<‘^°"®0^ 


The expression in Eq. (13) yields the time-dependent tip diffracted field for arbi-
trary source and observation points provided, however, that $\theta + \theta' < 2\theta_0 - \pi$. The
solution describes a spherical pulse centered at the cone tip and arriving at the
observation point at $t - t' = (r + r')/c$, i.e., after the time interval required for the incident
pulse to travel from the source point to the tip and for the diffracted pulse to travel
from the tip to the observation point (Figure 2).
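
The arrival time of the tip-diffracted pulse and the quantity cosh α of Eqs. (10c) and (14a) can be checked with a short sketch. The function names and the numerical values are hypothetical; the only relations assumed are the delay (r + r')/c and cosh α = (c²τ² - r² - r'²)/(2rr'), with τ = t - t'.

```python
import math


def tip_arrival_delay(r_obs, r_src, wave_speed):
    """Delay for the path source -> cone tip -> observation point."""
    return (r_obs + r_src) / wave_speed


def cosh_alpha(delay, r_obs, r_src, wave_speed):
    """cosh(alpha) = (c^2 tau^2 - r^2 - r'^2) / (2 r r')."""
    return ((wave_speed * delay) ** 2 - r_obs ** 2 - r_src ** 2) / (2.0 * r_obs * r_src)


def alpha(delay, r_obs, r_src, wave_speed):
    """Return alpha, or None before the diffracted pulse has arrived."""
    value = cosh_alpha(delay, r_obs, r_src, wave_speed)
    if value < 1.0:
        return None
    return math.acosh(value)


# Hypothetical geometry in normalized units.
r_obs, r_src, wave_speed = 3.0, 2.0, 1.0
arrival = tip_arrival_delay(r_obs, r_src, wave_speed)
```

At the arrival instant cosh α equals 1, so α starts at zero and grows afterwards, which is consistent with the unit step function that switches the scattered field on at that time.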

2.  Closed  Form  for  On-Axis  Source  and  Small  Cone  Angle 

To simplify the general result, we assume first that the source point is located
on the cone axis, i.e., $\theta' = 0$. Then Eq. (13) reduces to


G ( r,  r';  t,  t')  = 

4Trrr 


7 /q  '^^^t^»h^^Pl/2+ix<^°"®^Pl/2+ix<^°®*' 


P-l/2+ix(-"°®V„,.  r+r\ 

• P 

P-l/2+ix^‘'°®®0^ 

Furthermore, we specialize to small cone angles, $\theta_0 \approx \pi$, for which the following ap-
proximation applies:


P-l/2+ix<-"°^%) 

P-l/2+ix<"°"  V 


^ 2 

cosh  trx  In  [{ — ^ — ) 1 


• ®0  ^ 


Eq.  (15)  is  then  reduced  to: 


G (r,r';t,t') 

s tr  - ^ 

4rr'ln[(— ^)^ 


• / dx 


xtanh  irx 


:oshirx  -l/2+ 


(coshQ)U(T-  (17) 


The integral in Eq. (17) is a Mehler transform. By employing the formula

$$ \int_0^\infty dx\, \frac{x \tanh \pi x}{\cosh \pi x}\, P_{-1/2+ix}(w)\, P_{-1/2+ix}(z) = \frac{1}{\pi (w + z)} $$

one  obtains  in  closed  form. 


G^(  r,  r';  t,  t')  - 


c 1 U(  T - ^ ^ ) 

ir  - 0„  , cosh  Q + cos  0 ' c 

4irrr'ln[ — = — ; ] 


or,  written  explicitly. 


Gg(  r,  z'\  t,  t')  - 


' u(T.i^) 
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where 


2.2  . /,2 

c T (r  + r ) 

4rr' 


dv^ 

= cos  ^ 


(19a) 


This expression is remarkably simple, and quite unexpectedly so; however, it is
subject to the restrictions $\theta < 2\theta_0 - \pi$ and $\pi - \theta_0 \ll \pi$ mentioned earlier. It can be shown
that the solution agrees with independently derived results for the early-time and
long-time responses, thereby confirming its validity in these limiting cases.
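The Mehler-type integral used in the step to the closed form can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the formula in the form $\int_0^\infty x \tanh(\pi x)\,\mathrm{sech}(\pi x)\, P_{-1/2+ix}(w) P_{-1/2+ix}(z)\,dx = 1/[\pi(w+z)]$, evaluates the conical functions from their hypergeometric series (valid for $-1 < z < 3$), and integrates by Simpson's rule. The function names, the test arguments, and the truncation limit `x_max` are illustrative choices.

```python
import math

def conical_p(x, z, max_terms=400):
    """P_{-1/2+ix}(z) via the series F(1/2-ix, 1/2+ix; 1; (1-z)/2), valid for -1 < z < 3."""
    s = (1.0 - z) / 2.0
    term = 1.0
    total = 1.0
    for k in range(max_terms):
        # ratio of successive terms: ((k+1/2)^2 + x^2) / (k+1)^2 * s  (all real)
        term *= ((k + 0.5) ** 2 + x * x) / ((k + 1) ** 2) * s
        total += term
        if abs(term) < 1e-17 * abs(total):
            break
    return total

def mehler_integral(w, z, x_max=15.0, n=3000):
    """Simpson approximation of the integral over 0 <= x <= x_max (n must be even)."""
    h = x_max / n

    def integrand(x):
        weight = x * math.tanh(math.pi * x) / math.cosh(math.pi * x)
        return weight * conical_p(x, w) * conical_p(x, z)

    total = integrand(0.0) + integrand(x_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

# Illustrative arguments: w = cos(theta) with theta = 1 rad, z = cosh(Q) with Q = 0.8
w = math.cos(1.0)
z = math.cosh(0.8)
print(mehler_integral(w, z), 1.0 / (math.pi * (w + z)))
```

For $w = z = 1$ both conical functions equal one and the integral can be done by parts, giving $1/(2\pi)$, which is a convenient independent check of the constant.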

Analogous  simple  results  can  be  developed  for  the  Neumann  (acoustically  hard) 
boundary  condition,  and  for  the  electromagnetic  vector  dipole  field  in  the  presence  of 
a perfectly  conducting  cone. 

ANALYSIS OF SURFACE WAVEGUIDE DISCONTINUITIES
E-W.  Hu  and  L.  Bergstein 

Using a closed-waveguide approach, we recently analyzed¹ the scattering of an
even TE surface-wave mode impinging upon a step discontinuity in an open dielectric
slab waveguide. We report here the results for the same problem but with an even
TM incident surface wave.

Marcuse² has solved the scattering problem using two different approximations.
For the TE case, both approximations yield the same answer, which agrees with our
results. For the TM case, however, the results of the two approximations begin to
diverge as the waveguide thickness increases beyond a quarter of a wavelength, with
one of the approximations yielding negative radiation losses for large waveguide
thicknesses. Our approach yields correct results irrespective of the waveguide thickness.
Moreover, it is straightforward and applicable to multimode and other, more complex
waveguide structures. It has the additional advantage that it yields information on the
radiation pattern of the scattered waves.

A.  Formulation  of  the  Problem 

The open dielectric slab waveguide with a discontinuity at $z = 0$ is shown in Figure
1(a). The thickness of the waveguide is $2d_1$ to the left of the discontinuity and $2d_2$ to
the right. The dielectric constants in the various waveguide regions are given by

$$\epsilon_L(x) = \begin{cases} \epsilon_{r1}\epsilon_0, & -d_1 < x < +d_1 \\ \epsilon_{r2}\epsilon_0, & |x| > d_1 \end{cases}$$

for the left waveguide ($z < 0$), and

$$\epsilon_R(x) = \begin{cases} \epsilon_{r3}\epsilon_0, & -d_2 < x < +d_2 \\ \epsilon_{r4}\epsilon_0, & |x| > d_2 \end{cases}$$

for the right waveguide ($z > 0$), where $\epsilon_{ri}$ ($i = 1, 2, 3, 4$) is the relative dielectric
permittivity and $\epsilon_0$ is the permittivity of free space. It is assumed that
$\epsilon_{r1}$ [?] $\epsilon_{r3}$ [?] $\epsilon_{r4}$.

We approximate the open waveguide by the corresponding closed structure shown
in Fig. 1(b) and choose the distance $a$ of the perfectly conducting enclosure sufficiently
large so that a further increase will have no discernible effect on the results.

Assuming that there is no field variation in the $y$ direction, i.e., that $\partial/\partial y = 0$,
we can decompose the fields into TE and TM modes. Moreover, from symmetry
considerations it is clear that instead of solving the waveguide structure of Fig. 1(b) we can
deal with the simpler geometries shown in Figs. 2(a) and 2(b) for odd and even TM modes,
respectively.


Fig. 2. (a) ...ly conducting bisecting wall; (b) shows the reduced waveguide for the
even TM case with a perfect electrically conducting bisecting wall.

The mode functions and dispersion equations for the various waveguide regions
are discussed and listed in Reference 1. They will therefore not be repeated here.

For simplicity, we assume that only the fundamental surface modes are supported by
the waveguides on both sides of the discontinuity. If a harmonic surface wave with a
time variation $\exp(j\omega t)$ and a transverse magnetic field of unit amplitude is incident
from $z = -\infty$ toward the discontinuity at $z = 0$, the total transverse fields for $z < 0$ after
scattering are given by

-jP  Z jP  Z 00  jP  z 

H (x.z)=(e  - ^rn®  ^ ‘ '^n 

^ n=2 


-i(3  z ip  z b jP  2 

m n=  2 n 


Similarly,  the  total  transverse  fields  for  z>0  are  given  by 

-jP  z _ * _ 'jP  ^ 

H (x,z)  = A e P $ (x)  + y B 4j  (x)  e 
y P P q=2  ^ ^ 


-jPqZ 


In these equations, $\phi_m(x)$ and $\Phi_p(x)$ are the normalized surface mode functions,
and $\psi_n(x)$ and $\Psi_q(x)$ are the non-surface mode functions; $a_m$, $A_p$, $b_n$ and $B_q$ are the
transverse field amplitudes for the corresponding waves; $y_m$, $Y_p$, $y_n$ and $Y_q$ are the
longitudinal wave impedances; and $\beta_m$, $\beta_p$, $\beta_n$ and $\beta_q$ are the longitudinal
propagation constants.

At the junction $z = 0$, the transverse fields must be continuous. This leads to the
following two sets of equations:

n=2  q=2 


■'m  n=2  n p q=<iq 


Using  the  orthogonality  properties  of  mode  functions,  Eqs.  (4)  and  (5)  reduce  to 


(5) 
fc.,*


{<4>  . — 

*-  tn  £ n 


_ Pm  1 - I 1 

(f)  > + -3 < (j)  , (()>+<())  , 4^  ^ + "3 — <l>  1 — 4^  ^ I' 


**  r P P "i 

+ V ■{  < i|j  I — 4>  > + 3 — < , — (j)  > + < 4j  I — 4^  > + 75 — < 4j  ) — 4^  ^ ^ b 

n=2^  " "R  P ^ " ^L  P ^ ^ '"n-E^'^qJ  n 


= <4>  . — 4>  > - -?^^  <<}>  ,—  4)>  + <(t)  , — 4j>--^<4>  , — 4;  > 


with $m = p = 1$ and $q = 2, 3, 4, \ldots, \infty$. We solve this infinite system of linear equations for
the amplitudes by truncating the higher evanescent modes and solving the
resulting finite set of linear equations numerically. The truncation introduces only
a negligibly small error.
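The truncation step can be illustrated with a toy problem. The sketch below does not use the coupling coefficients of Eqs. (6); it assumes a hypothetical system (identity plus a kernel that decays with mode index, and a decaying right-hand side) purely to show how one checks that the leading amplitude stops changing as more modes are retained. All names and numerical values are illustrative.

```python
def solve_linear_system(a, b):
    """Gaussian elimination with partial pivoting for a small dense system a x = b."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x

def truncated_amplitudes(n_modes):
    """Solve the hypothetical mode-coupling system truncated to n_modes unknowns."""
    a = [[(1.0 if i == j else 0.0) + 1.0 / (i + j + 2) ** 4
          for j in range(n_modes)] for i in range(n_modes)]
    b = [1.0 / (i + 1) ** 2 for i in range(n_modes)]
    return solve_linear_system(a, b)

# Watch the leading amplitude settle as the truncation order grows.
for n_modes in (5, 10, 20, 40):
    print(n_modes, truncated_amplitudes(n_modes)[0])
```

In practice the same check would be run on the actual coupling matrix, increasing the number of retained evanescent modes until the surface-wave amplitudes change by less than the desired tolerance.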


B.  Numerical  Results 


We solved the discontinuity problem (i.e., Eqs. (6)) numerically for the same set
of parameters as those reported by Marcuse.² Thus, we assumed that the refractive
indices $n_i = \sqrt{\epsilon_{ri}}$ of the waveguide media are $n_2 = n_4 = 1$ and $n_1 = n_3 = 1.432$, and
determined the scattered fields as a function of the normalized waveguide thickness
$kd_1 = 2\pi d_1/\lambda$ with the thickness ratio $d_2/d_1$ as a parameter. The results for the radiation
losses are shown in Fig. 3 for the case $d_2/d_1 = 0.5$. The two dashed curves show
the results reported by Marcuse (using two different approximations); our results (for
$0 < kd_1 < 3$) are shown by the solid curve. As expected, the radiation losses generally
decrease with increasing waveguide thickness. We observe that all three results show
very good agreement for $kd_1$ values in the range between 0.7 and 1.7. For values of
$kd_1$ below 0.7, our results show somewhat higher radiation losses. For $kd_1$ values
above 1.7, Marcuse's two results begin to diverge; one of the approximations yields
meaningless negative radiation losses for $kd_1$ values greater than 2.0, while the other
approximation shows higher radiation losses than those found by our approach. The
radiation loss minimum appears to occur at $kd_1 \approx 2.7$. Marcuse shows a loss minimum
of about 2%; our results show a loss minimum below 0.5%.

Fig. 3. Relative radiation losses $\Delta P/P$ due to the scattering of surface waves by the
step discontinuity for the even TM case with $n_1 = n_3 = 1.432$, $n_2 = n_4 = 1$, and
$d_2/d_1 = 0.5$. The losses are shown as a function of the normalized waveguide thickness
$kd_1 = 2\pi d_1/\lambda$. The solid curve shows the result obtained in this work; the two dashed
curves represent the results reported by Marcuse.
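The single-mode assumption made in the formulation can be checked against the parameters quoted above. The sketch below is not taken from Reference 1; it assumes the standard textbook dispersion relation for TM modes of a symmetric slab whose $H_y$ is even in $x$ (the symmetry enforced by a perfect electrically conducting bisecting wall), namely $u \tan u = (n_1/n_2)^2 w$ with $u^2 + w^2 = V^2$ and $V = kd\,(n_1^2 - n_2^2)^{1/2}$. Under that assumption the next even TM mode appears at $V = \pi$. Function names are illustrative.

```python
import math

def v_number(kd, n_core, n_clad):
    """Normalized frequency V = k d sqrt(n_core^2 - n_clad^2) for half-thickness d."""
    return kd * math.sqrt(n_core ** 2 - n_clad ** 2)

def even_tm_mode_count(kd, n_core, n_clad):
    """Number of guided TM modes with even H_y; their cutoffs lie at V = 0, pi, 2 pi, ..."""
    return int(v_number(kd, n_core, n_clad) // math.pi) + 1

def fundamental_even_tm_index(kd, n_core, n_clad):
    """Effective index of the lowest even TM mode from u tan u = (n_core/n_clad)^2 w."""
    v = v_number(kd, n_core, n_clad)
    ratio = (n_core / n_clad) ** 2

    def f(u):
        w = math.sqrt(max(v * v - u * u, 0.0))
        # u tan u - ratio * w, multiplied by cos u to avoid the pole at pi/2
        return u * math.sin(u) - ratio * w * math.cos(u)

    lo, hi = 0.0, min(v, math.pi / 2.0)  # f(lo) < 0 < f(hi)
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    u = 0.5 * (lo + hi)
    return math.sqrt(n_core ** 2 - (u / kd) ** 2)

for kd in (0.5, 1.0, 2.0, 3.0):
    print(kd, v_number(kd, 1.432, 1.0), even_tm_mode_count(kd, 1.432, 1.0),
          fundamental_even_tm_index(kd, 1.432, 1.0))
```

With $n_1 = 1.432$ and $n_2 = 1$, $V \approx 1.025\,kd_1$, so the whole range $0 < kd_1 < 3$ stays below $\pi$ and only the fundamental even TM surface mode is guided; the thinner right-hand guide ($d_2 = 0.5\,d_1$) is further from cutoff of the next mode.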

STRAY FIELD OF A SLOTTED MICROSTRIP LINE
I.  Palocz 

In a number of recent computer memories, microstrip-type lines are used. In
an attempt to gain a better understanding of the behavior of these lines, the present
author conducted several previous studies. In this contribution, the magnetic field
at the ground plane and at the air boundary is investigated. This calculation is
performed in connection with the stray fields occurring in memories. It should be
mentioned that, in present-day experimental memories, the so-called bit lines are very
close to the ground plane, and the stray magnetic fields limit how closely the elements
can be packed; the effective area of one storage element is determined by this field.

In this study, we calculate the magnetic field in the $x$ direction, $H_x$, which is of
practical interest. For notation, see Figure 1. Note that $H_x$ is proportional to $E_y$ and that
the proportionality factor is [illegible]. The calculation is a straightforward extension
of this author's previous calculation,² where the current distribution has been analyzed
by using a finite Hilbert transformation. Since the current distribution is known, the
magnetic field is simply given by (see Eq. (36) of Reference 2):


H = — / i(x') 


1 ( 1 ; 
(x-x')^+  h^  (x+x')^+  h'^j 


(1) 


where the notation is evident from Figure 1. (By multiplying all the dimensions by $w/2$
one obtains results for a line of width $w$ rather than for a line of width two.)


The slotted line will now be compared with an unslotted one (see the upper right
part of Figure 1). For the latter, the ground plate is infinite and the upper electrode
is semi-infinite, and the coordinate system is chosen so that $x$ equals zero at the edge
of the semi-infinite electrode. The stray field of this line was calculated long ago,
in a rather ingenious manner, by Hermann von Helmholtz and Gustav Kirchhoff.

The parametric representations of the coordinates $x$ and $y$ are given by

$$\frac{x}{b} = \frac{1}{\pi}\left(1 - t - e^{-t}\cos s\right), \qquad \frac{y}{b} = \frac{1}{\pi}\left(s - e^{-t}\sin s\right). \tag{2}$$

Letting $s = \text{const.}$, we obtain the equipotential lines, while $t = \text{const.}$ yields the lines of
force.

The potential of the upper plate is zero ($s = 0$); it is evident from Eq. (2) that for
this case $y = 0$. For $t = 0$, $x = 0$, and for all positive and negative values of $t$ the value
of $x$ is negative; the electrode is semi-infinite. For the ground plane $s = \pi$; as is evident
from Eq. (2), $x$ assumes all negative and positive values as $t$ varies from $-\infty$ to $\infty$.
Since the field strength at the ground plane is of interest, one writes for the
neighborhood of the ground plate $s = \pi - \Delta s$, where $\Delta s$ is small; omitting quadratic and
higher-order terms in the expansion of $\cos \Delta s$ and $\sin \Delta s$, one gets

$$\frac{x}{b} = \frac{1}{\pi}\left(1 - t + e^{-t}\right), \tag{3a}$$

$$\frac{y}{b} = 1 - \frac{\Delta s}{\pi}\left(1 + e^{-t}\right). \tag{3b}$$

Hence the magnitude of the field strength $H_x$, normalized to $H_{\max}$, is

$$\frac{H_x}{H_{\max}} = \frac{1}{H_{\max}}\left|\frac{\Delta s}{\Delta y}\right| = \frac{1}{1 + e^{-t}}. \tag{4}$$

For any value of $t$, one gets $x/b$ from Eq. (3a) and the corresponding normalized field
from Eq. (4). The result of this calculation is shown in curve a) of Figure 1.
The slotted case is plotted in curve b) of Fig. 1 for the case of $a = 1/3$, $h = 0.08$.

Fig. 1. a) $H_x/H_{\max}$ versus $x$ for a semi-infinite strip above ground; b) $H_x/H_{\max}$
versus $x/b$ for a slotted line.
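The procedure for curve a) can be sketched in a few lines: pick a value of the parameter $t$, obtain $x/b$ from the ground-plane form of the mapping, and pair it with the normalized field. The sketch assumes the ground-plane relations in the form $x/b = (1 - t + e^{-t})/\pi$ and $H_x/H_{\max} = 1/(1 + e^{-t})$, which follow from linearizing the Helmholtz-Kirchhoff mapping about $s = \pi$; the sign convention for $t$ and the function names are assumptions of this example.

```python
import math

def strip_field_point(t):
    """Return (x/b, H_x/H_max) on the ground plane for parameter value t."""
    x_over_b = (1.0 - t + math.exp(-t)) / math.pi
    h_norm = 1.0 / (1.0 + math.exp(-t))
    return x_over_b, h_norm

def strip_field_curve(t_min=-3.0, t_max=8.0, n_points=45):
    """Sample the parametric curve; increasing t moves from the open side to under the strip."""
    step = (t_max - t_min) / (n_points - 1)
    return [strip_field_point(t_min + i * step) for i in range(n_points)]

for x_over_b, h_norm in strip_field_curve(n_points=12):
    print(f"x/b = {x_over_b:8.3f}   H_x/H_max = {h_norm:.4f}")
```

Under these assumptions the field has dropped to one half of its maximum at $x/b = 2/\pi$ (the point $t = 0$), i.e., somewhat beyond the edge of the upper electrode, and it decays monotonically with increasing $x$.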

PHYSICAL PROCESS OF SAW SCHOTTKY DIODE MEMORY CORRELATOR
W-C.  Wang 

We attempt here to explain the physical process involved in SAW diode memory
correlators, based on the published experimental results of Ingebrigtsen, Cohen and
Mountain.¹ They obtained their correlation read-out by observing the variations of the
convolution output at frequency $2\omega$ between two r.f. pulses, one long and one short, at
frequency $\omega$. Figure 2 of Ref. 1 is redrawn here as Figure 1. It shows the Schottky
diode matrix (inserts A and B) and the diode-LiNbO₃ memory structure.

Fig. 1. Schematic of the experimental configuration with
a photograph of the Schottky-diode matrix.

The spherical platinum silicide Schottky diodes are 5 μm in diameter. The
periodicity of the diodes is 12.7 μm. Each diode is overlaid with a square of
Cr/Au with dimensions of approximately 10 × 10 μm².

For  clarity,  we  also  single  out  the  MOS  portion  of  the  device  which  is  indicated 
by  the  dotted  square  and  redrawn  in  insert  A. 

The  MOS  portion  of  the  structure  is  found  to  be  essential  in  the  operation  of  the 
diode  memory  correlator.  This  point  will  become  clearer  as  we  proceed. 

The amplitude variation of the convolved signal versus applied d.c. plate pulses
of short duration (150 nsec), obtained by Ingebrigtsen et al., is redrawn in Fig. 2, where
the plate pulse is used to forward bias the Schottky diodes. When the Schottky diode is
forward biased, the injected carriers distribute themselves over the entire Cr/Au
layer (10 × 10 μm²) and serve as a reverse bias for the MOS structure (insert A).

The shape of the convolution curve in Fig. 2 can be qualitatively explained based
on the theory developed by Gautier and Kino for the reverse-biased MOS structure.²,³

Fig. 2. Normalized convolution output as a function of plate pulse voltage
(horizontal axis: plate pulse amplitude). (Figure 3 of Reference 1.)

Let us assume that when no bias voltage is applied, the n-type MOS structure is
at the flat-band condition. When a small voltage is applied to the structure, as shown
by portion I of the curve, the depletion width widens and the convolution output increases.
The reason is that the sonic induced field can now better modulate the depletion width.
When the depletion width increases further, as shown in portion II of the curve, it
effectively increases the gap spacing between the semiconductor and the piezoelectric
substrate; thus it decreases the piezoelectric coupling and decreases the convolution
output as well. Portion III of the curve is the part stated by Ingebrigtsen et al. to be
most sensitive to the voltage variations. It is the place where the population of holes
starts to surpass the population of electrons, i.e., the inversion layer begins to form
there. When the inversion takes place, the transverse d.c. acoustoelectric voltage
reverses its sign and begins to forward bias the MOS diode. Its net effect slightly
reduces the reverse-biasing voltage and results in a slight increase in the convolution
output. When the reverse bias increases still further, as shown in portion IV of
the curve, it thickens the inversion layer. Since the inversion layer shields the
semiconductor from the piezoelectric field, the convolution output decreases quite
rapidly.

The explanation of the convolution output given so far is made under the
assumption that the MOS diode is under continuous reverse biasing, but that no initial
charges are stored on the Cr/Au film. Supposing that initial charges are present there,
then in effect the stored electrons would increase the reverse-bias voltage by $\Delta V_g$. That is,
the stored charges would shift the set of convolution curves obtained by Ingebrigtsen
et al. to the left by $\Delta V_g$. The solid lines in Fig. 3(a) are two of their original
convolution curves with plate pulse widths of 150 and 50 nsec; the dotted lines represent the
shifted curves. The differences in convolution output between the two sets of curves
are plotted in Figure 3(b). This difference-convolution voltage $\Delta v$ should be equal to
the correlation read-out obtained experimentally by Ingebrigtsen et al., since the $\Delta V_g$
shift in Fig. 3(a) has been adjusted to correspond to the amount of charge stored in
their experiment. The correlation read-out (normalized) observed by Ingebrigtsen et al.
is redrawn in Figure 3(c).


Fig. 3. (a) and (c) are plots of the convolution output and the correlation dip
as a function of plate pulse height (horizontal axes: plate pulse amplitude). The
stored pulse and the reading pulse are 250 nsec long. (Figure 3 of Reference 1.)
Figure 3(b) plots the differences in convolution output (shown in 3(a)) between
the solid curves and the dotted curves.

In comparing Figs. 3(b) and 3(c), one observes the apparent agreement. The
differences in magnitude between Figs. 3(b) and 3(c) arise because Ingebrigtsen et al.
normalized their experimental data. Excellent agreement is obtained when the
normalization factor is taken into account.
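The construction of Fig. 3(b) from Fig. 3(a) amounts to subtracting a curve from a copy of itself shifted along the voltage axis. The sketch below illustrates only that operation; the curve shape is a hypothetical smooth stand-in (a Gaussian in arbitrary units), not the measured data of Reference 1, and the shift value is likewise illustrative.

```python
import math

def convolution_curve(v):
    """Hypothetical stand-in for a measured convolution-output curve (arbitrary units)."""
    return math.exp(-((v - 300.0) / 150.0) ** 2)

def difference_signal(v, delta_v):
    """Shifted-minus-original output; a shift to the left by delta_v evaluates the curve at v + delta_v."""
    return convolution_curve(v + delta_v) - convolution_curve(v)

def curve_slope(v, h=1e-3):
    """Central-difference estimate of the local slope of the stand-in curve."""
    return (convolution_curve(v + h) - convolution_curve(v - h)) / (2.0 * h)

delta_v = 20.0  # illustrative shift, same arbitrary units as v
for v in range(0, 801, 100):
    print(v, round(difference_signal(v, delta_v), 4), round(delta_v * curve_slope(v), 4))
```

For a small shift the difference is approximately the shift times the local slope of the curve, so the read-out obtained this way is largest where the convolution curve is steepest, which is consistent with the steep portion of the curve being singled out as the most voltage-sensitive region.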

In conclusion, we have shown that the MOS portion of the structure does play an
essential part in the diode memory correlator. This conclusion is expected to be
general and to apply to other published types of diode memory systems as well. The
Schottky diode serves as a carrier injector, and the amount of charge injected is
modulated by the acoustic write-in signal. As far as the correlation read-out is concerned,
the contribution of the Schottky diode to the convolution signal is small, possibly due
to the simple fact that it occupies only one-eighth of the total surface area.⁴ It is also
clear that in this particular read-out system the maximum correlation output appears
at the inversion region, which would not be the case for other types of read-out
systems.⁵⁻⁸ In their second paper, Ingebrigtsen and Stern modified their memory
structure, replacing the Cr/Au overlay by a high-resistivity polysilicon film. The
modified structure is more complicated to analyze, since the contribution of the
resistive polysilicon layer may not be ignored.
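The one-eighth figure can be reproduced from the dimensions quoted earlier (5 μm diode diameter, 12.7 μm periodicity). The short sketch below assumes a circular diode footprint on a square lattice, which is an assumption of this example rather than a statement from Reference 1; the function name is illustrative.

```python
import math

def diode_area_fraction(diameter_um, pitch_um):
    """Fraction of a square unit cell (side = pitch) covered by one circular diode."""
    diode_area = math.pi * (diameter_um / 2.0) ** 2
    cell_area = pitch_um ** 2
    return diode_area / cell_area

fraction = diode_area_fraction(5.0, 12.7)
print(f"diode area fraction = {fraction:.4f} (1/8 = {1 / 8:.4f})")
```

Under these assumptions the diodes cover about 12% of the surface, in line with the one-eighth estimate; the remainder of each cell belongs to the MOS portion of the structure.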

EXPERIMENTS ON SAW DIODE CORRELATOR
W-C.  Wang 

The use of the nonlinear interaction of acoustic surface waves to perform correlation
has been under intensive study for several years. Advances made in this
period have been very rapid, especially in the direction of the memory correlator.
We have carried out experimental studies on the Si-LiNbO₃ diode correlator. Most of
the experimental results presented here were obtained using Thomson-CSF vidicon
diode arrays, while the author was on leave at Thomson-CSF. Similar results have
been obtained at the Polytechnic using an RCA 4532 vidicon diode mosaic. We describe
first the inter-relationships among convolution, stored charge and correlation, and
then the effect of d.c. pulse biasing on the correlation output.

A.  Inter-relationships  Among  Convolution,  Correlation  and  Charge  Storage 


Figure 1 shows the experimental configuration, which is the same as that reported by
Maerfeld et al. P-n diode arrays are used in the structure. A brief description of the
diode mosaic is given here. The original Si wafer is n-type with a resistivity of 10 Ω-cm.
On its surface a 5000 Å SiO₂ layer is grown. Then, through an etching process, arrays
of 5 μm diameter holes are made in the SiO₂ film. At the hole sites the p-n diode
arrays are formed. The carrier density in the P-region is about 10^[?] atoms/cm³. The
diodes are then overlaid with an 8 × 8 (μm)² P⁺ Si layer 1 μm thick. The periodicity
of the diodes is 12.5 μm.

Fig. 1. Our experimental configuration. Center frequencies of A and B, 60 MHz,
and R, 120 MHz.

On the LiNbO₃ substrate, as shown in Fig. 1, a pair of 60 MHz interdigital transducers
and one 120 MHz interdigital transducer are deposited at their respective locations A, B and R.

Let a signal of frequency $\omega$ be applied to transducer A, exciting a surface wave
propagating in the $x$ direction; charge is then stored due to the action of the piezoelectric
field associated with the surface wave forward-biasing the diode arrays. The amount
of charge stored can be controlled by many factors, such as the diode characteristics,
the pattern of the diode arrays and their overlays, the signal amplitude, the d-c biasing
condition and the repetition rate of the applied signals. If another signal of the same
carrier frequency $\omega$ is applied to transducer B, exciting a wave propagating in the $-x$
direction, then due to nonlinear mixing two signals are produced. One is the commonly
known convolution signal at the sum frequency $2\omega$, and the other is at the difference
frequency, which is independent of time but has a spatial variation of the form
$\exp(\pm j2kx)$. Both the linear and nonlinear fields induced by the two oppositely
propagating waves will affect the amount and the pattern of the charge storage. However,
the concern here is the retention of both the amplitude and phase information of the
spatially varying nonlinear signal, since it is this signal which will later interact with
the reading pulse to produce the correlation output. Thus, it is important to know the
conditions under which one can optimize the storage of a spatially varying signal.

The stored charge will act as a reverse bias for both the diode arrays and the
MOS portion of the structure. It is known that the convolution signal is sensitive
to the biasing condition, i.e., by observing the variation of the convolution output, one may
obtain information on the charge storage. This point has been confirmed by the
following experiment.

Write-in signals of large amplitude (20 volts p-p) are first applied to transducers
A and B. The observed convolution output is quite large, actually off scale (Fig. 2).
Then, at time $t_0$, the signal amplitudes are suddenly switched from 20 volts to 6 volts;
the convolution output is observed to drop to level I at $t_0$ and then slowly increase to
level II. This rise time constant can be related to the leakage rate of the stored
charge. Under the same experimental conditions, but 1.2 msec after the writing signals
are applied to A and B, a low-level reading pulse is applied to transducer R (120 MHz) to
observe the changes in correlation output corresponding to the sudden change of writing
signal amplitudes applied at transducers A and B. Figure 2(b) shows the changes as a
function of time. At $t_0$ the correlation output starts to decrease, and the time it requires
to reach steady state is seen to be closely related to that in Figure 2(a). In fact, they
have the same time constants. Figures 3(a) and 3(b) are obtained under the same
conditions as Figs. 2(a) and 2(b) except that the ambient temperature is 9°C for Fig. 3 and
18°C for Figure 2. It is noted that the time constant at low temperature is larger than
that at high temperature, which is expected from the temperature characteristic of a
reverse-biased diode.

It is of importance to write down the correlation output we have observed in the
above experiment. Let the signals applied to transducers A and B be $A\,e^{j\omega t}$ and $B\,e^{j\omega t}$;
then during the write-in process the stored signal is

[Eq. (1), not recoverable from the source]

After the application of the read-out signal, $R\,e^{j2\omega t}$, one obtains the following
correlation output:

$$P_2\, e^{jkL} \int C\!\left(\frac{L - 2x}{v}\right) R\!\left(t - \frac{x}{v}\right) e^{j2\omega t}\, d\!\left(\frac{x}{v}\right) \tag{2}$$

where $P_1$ and $P_2$ are constants, $L$ is the path length between transducers A and B, and
$T$ represents the duration of the shorter signal.

Fig. 2. (a) Convolution and (b) correlation, as a function of time. Ambient temperature 18°C.

Fig. 3. (a) Convolution and (b) correlation, as a function of time.

Equation (1) indicates that signal A has correlated with signal B through time
integration during the process of charge storage.

Equation (2) indicates that the reading signal R has correlated with the stored
signal $C(2x/v)$ through spatial integration during the read-out process. The output
represented by Eq. (2) is a complicated one. However, it can be simplified, and is of
practical importance, under the following two conditions.

(1) If signal R is of very short duration and can be approximated by $\delta(t)$, then
the read-out signal truly represents the correlation between signals A and
B. However, it should be pointed out that since the time integration is
performed by charge accumulation, the time rate of charge build-up is required
to be linear. Otherwise, discrepancies will be introduced.

(2) If signal B is of very short duration, $C(-2x/v) \sim A(-2x/v)$, then the
read-out signal represents the correlation between signals A and R. Here the
correlation process is performed through spatial integration. It should also be
pointed out, however, that a time scaling factor of one-half is introduced in
this type of reading and writing process. The time scaling factor can be
removed by using a layered structure, LiNbO₃-Si-BGO.

For the experimental results presented in this write-up, neither the duration of
signal R nor the duration of signal B is made very short, i.e., the correlation output
described here is a complicated one. We did not use a very short reading or
writing pulse because we prefer to have a large signal output in conducting
various studies of this device.
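The time scaling factor of one-half can be made concrete with a discretized read-out integral. The sketch below is only an illustration under stated assumptions: it takes the envelope of the read-out to be $\int C(T_L - 2u)\, R(t - u)\, du$ with $u = x/v$ and $T_L = L/v$, drops the carrier and all constant factors, and uses a hypothetical rectangular stored signal and an impulse-like reading pulse. None of the names or numbers come from the experiment.

```python
def readout_envelope(stored, reading_samples, t_total, dt, n_out):
    """Discretize out(t_m) = sum_n stored(t_total - 2 u_n) * reading[m - n] * dt with u_n = n dt."""
    n_u = int(round(t_total / dt)) + 1
    out = []
    for m in range(n_out):
        acc = 0.0
        for k, r in enumerate(reading_samples):  # k = m - n is the lag index
            n = m - k
            if 0 <= n < n_u:
                acc += stored(t_total - 2.0 * n * dt) * r * dt
        out.append(acc)
    return out

def stored_pulse(tau):
    """Hypothetical stored signal: unit rectangle of duration 4 (arbitrary time units)."""
    return 1.0 if 2.0 <= tau <= 6.0 else 0.0

dt = 0.05
t_total = 10.0
impulse = [1.0 / dt]  # discrete stand-in for a very short reading pulse
out = readout_envelope(stored_pulse, impulse, t_total, dt, n_out=201)

duration_out = sum(1 for value in out if value > 0.5) * dt
print("stored duration 4.0, read-out duration", duration_out)
```

With an impulse-like reading pulse the output simply replays the stored pattern as $C(T_L - 2t)$, so a stored feature of duration 4 appears with a duration of about 2: the factor of two in the argument is the time scaling referred to above.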

The write-in process adopted here is believed to be the most efficient one, since
the writing pulse is applied to the transducer and its amplitude does not have to be
very large in comparison with the other write-in methods used by Bers and Cafarella and by
Ingebrigtsen, Stern, Mountain and Cohen.

B.  Effect  of  d-c  Pulse  Biasing  During  Writing  and  Reading  Intervals 

We have observed that both the signal amplitudes and the rate of charge build-up
are affected when a d-c pulse is applied. The extent of the effect has been found to
depend strongly on the writing and reading pulse amplitudes, and also on the time
and duration during which the d-c bias is applied. We first consider the case in which
the d-c pulse is applied while the charge is being stored, and then the case in which
it is applied while the correlation is being read out.

The triangular-shaped signal indicated by I in Fig. 4 is the convolution output
between two rectangular pulses, both 4 μsec in duration and 4 volts in amplitude.
The pulse of large amplitude corresponds to the reading pulse radiated through the air.
The correlation output, which should appear after the reading pulse at position II, is too
small to be recognized. When a pulsed d-c voltage is applied to forward-bias the diode
array while the charge storage is in process, as shown in Fig. 4(b), the
correlation output is greatly enhanced. In fact, its amplitude is comparable to that of
the convolution signal. The observed enhancement is due to the fact that in Fig. 4(a)
the writing signal is too small to give appreciable charge storage, but in Fig. 4(b), with
the help of forward biasing, the operating point on the diode characteristic is shifted
and more charge is stored. It is also noted that the convolution output is increased in
Figure 4(b).

Figure 5 describes another facet of the experiment. It indicates the effect of
forward biasing on the time rate of charge build-up. Here the repetition rate of the writing
signal is less than 3 pps, which is much slower than the reading pulse rate of about 100 pps.
Oscillograms a, b and c in Fig. 5 correspond to the condition in which no d-c bias is
applied. Oscillograms a', b' and c' are taken under the condition that the diodes are
forward biased while the writing process is in progress. The effect of reading pulse
height on the rate of charge build-up is also observed. The reading pulse heights are
18 volts in oscillograms a and a', 14 volts in b and b', and 11 volts in c and c'.
We notice that the common feature in the three sets of oscillograms is the
disappearance of the fast decay time constant when forward biasing is applied. This is
because more charge is stored when the diode is forward biased. When the reading pulse
is applied, the presence of the stored charge shifts the diode operating point
to the lower part of the diode characteristic and gives a much longer time constant.
This special feature is expected to be useful when multiple writings are employed or
when the correlation process is performed through time integration.

Fig. 4. Effect of d-c pulse biasing on the signal outputs. Position I is convolution
and II is correlation. Input signal amplitudes 4 V. Reading pulse height 8 V.
(a) No d-c bias applied. (b) Forward bias applied during writing process. Scale 100 μsec.

Fig. 5. Effect of forward biasing during writing process on the time rate of charge
build-up. Panel labels: no external d-c biasing; forward pulse biasing applied.
Scale 1 sec/div.

Now let us consider the case where the d-c bias is applied during the time interval
in which the correlation output is observed, as indicated in Figure 6. The extent of the
biasing effect depends on the amplitude of the writing signal. The amplitude of the writing
signal for oscillograms a, b and c is 9 volts. The first triangular signal corresponds to
the convolution output; the last signal corresponds to the correlation read-out. We
observe that the correlation signal is reduced in Fig. 6(b), when the structure is forward
biased, and enhanced in Fig. 6(c), when the structure is reverse biased. Oscillograms
d, e and f are obtained under the same conditions as a, b and c except that the
writing signal amplitude is increased to 22 volts. In this case we note that the
correlation output is only slightly affected by the d-c biasing. We further note that
forward biasing increases the correlation output and reverse biasing decreases it,
which is contrary to what we observed in oscillograms a, b and c.

Fig. 6. Effect of d-c pulse biasing (applied during read-out interval)
on the signal outputs. Position I indicates convolution output
and II correlation output.

In Fig. 7 we plot the amplitude of the correlation read-out as a function of the
reverse-bias voltage for a writing signal amplitude of 6 volts. Here the correlation output
first reaches a maximum and then decreases. From the experiments of Figs. 6 and 7 it
is clear that the effectiveness of biasing depends on both the writing and reading signal
amplitudes. In general, however, for a small writing signal and moderate reading
strength, the correlation output vs. reverse bias height follows the curve described
in Fig. 7.

Fig. 7. Correlation output as a function of reverse d-c biasing.
Bias voltage is applied during read-out time interval.
Writing signal amplitude = 6 volts. Reading signal amplitude = 9 volts.

The author wishes to thank C. Maerfeld for discussions and for the use of his
experimental set-ups.
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RAYLEIGH  WAVE  FIELDS  EXCITED  BY  AN  ACOUSTIC  BEAM  INCIDENT  FROM  A 
LIQUID NEAR THE RAYLEIGH CRITICAL ANGLE

Y.L.  Hou  and  H.  L.  Bertoni 

An acoustic beam reflected into a liquid from the surface of a solid at or near the
angle of phase matching with the free-surface Rayleigh wave exhibits features not
explained by geometrical acoustics [1,2,3]. The beam center is shifted from the geometrical
optics position and its profile has variations different from those of the incident beam.
Based on the properties of the reflected beam, it has been established that these effects
result from the excitation and re-radiation of the Rayleigh wave, which is of the leaky
type due to the presence of the liquid [4].

Although  it  is  known  indirectly  that  the  incident  beam  in  the  fluid  excites  the 
Rayleigh  wave,  the  elastic  fields  in  the  solid  have,  in  fact,  never  been  computed.  The 
work  described  here  establishes  that  the  elastic  fields  are  in  the  form  of  a Rayleigh 
wave  whose  amplitude  varies  along  the  surface.  Besides  completing  the  description  of 
the  beam  shift  phenomena,  this  study  allows  computation  of  the  power  flow  in  the  solid 
when  viscous  absorption  or  loss  is  present.  Previously,  power  flow  was  computed  only 
for  lossless  solids  using  conservation  of  energy  arguments.  In  the  lossless  case  it  was 
found  that  up  to  80  percent  of  the  incident  power  could  be  transferred  to  the  Rayleigh 
wave  prior  to  re-radiation. 

The power flow in the solid, and the effect of loss on it, are of interest in
nondestructive evaluation (NDE) of solids. Loss in polycrystalline materials is strongly
influenced by the grain size [5]. Thus, loss measurements can be used to determine
grain size. Grain size is of particular interest in structural materials (aluminum,
steel, etc.) since it is a major factor determining the materials' strength. Detecting
grain size near the surface of a solid from observation of the phase of reflected plane
waves has been proposed as an NDE technique [6,7]. Simpler amplitude measurements
for bounded beam reflection have also been shown to be applicable to NDE determination
of grain size [8,9]. Knowledge of the power flow in the solid due to a bounded incident
beam helps in understanding the features of the reflected beam that would be used in NDE
amplitude measurements.

A.  Approximations  for  Plane  Wave  Transmission  and  Reflection  Coefficients 

In the presence of a liquid at the surface of a solid, the free-surface Rayleigh
wave is perturbed into a leaky wave, for the usual case when the wavenumber $k$ in the
liquid is greater than the shear and longitudinal wavenumbers $k_s$ and $k_p$ in the solid.
For $\exp(-i\omega t)$ time dependence, the leaky wave fields vary along the surface shown in
Fig. 1 as $\exp(i k_\ell x)$, where

$$k_\ell = \beta + i\alpha \qquad (1)$$

Here, $\beta$ is the phase constant of the leaky wave and is close to the phase constant $k_R$ of
the free-surface Rayleigh wave. The attenuation constant $\alpha$ of the leaky wave results
from the fact that the Rayleigh wave sheds power into the liquid at an angle
$\theta_R = \sin^{-1}(\beta/k)$ as it propagates along the surface.

Fig. 1. Gaussian beam incident on a liquid-solid interface.
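
As a rough numerical illustration of the radiation angle $\theta_R = \sin^{-1}(\beta/k)$, the
following Python sketch approximates $\beta$ by the free-surface Rayleigh wavenumber, so that
$\beta/k$ becomes the ratio of the liquid sound speed to the Rayleigh wave speed. The wave
speeds used (1480 m/s for water, 2870 m/s for the Rayleigh wave on steel) are assumed,
illustrative values and are not taken from this report; the function name is hypothetical.

```python
import math

def rayleigh_angle_deg(c_liquid, c_rayleigh):
    """Angle (degrees) at which a leaky Rayleigh wave radiates into the liquid.

    Assumes beta ~ k_R, so that beta / k = c_liquid / c_rayleigh.
    """
    ratio = c_liquid / c_rayleigh
    if not 0.0 < ratio < 1.0:
        raise ValueError("phase matching requires c_liquid < c_rayleigh")
    return math.degrees(math.asin(ratio))

# Assumed, illustrative wave speeds in m/s (not from the report).
theta_R = rayleigh_angle_deg(1480.0, 2870.0)
print(f"Rayleigh angle ~ {theta_R:.1f} degrees")
```

The guard on the ratio reflects the condition stated above that the liquid wavenumber
must exceed the wavenumbers in the solid for the leaky wave to radiate.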

Because the leaky Rayleigh wave is a resonant solution satisfying boundary conditions
at the interface, the wavenumber $k_\ell$ will appear as a pole in the plane wave
transmission coefficients $T_p$ and $T_s$, and in the reflection coefficient $R$. Let
$k_x = k \sin\theta$ be the transverse wavenumber of a plane wave incident at the angle
$\theta$. The transmission coefficients, as functions of $k_x$, may be expanded about the
pole at $k_x = k_\ell$ in a Laurent series. Retaining only the first term then gives a
good approximation to $T_p(k_x)$ and $T_s(k_x)$ for $k_x$ in the vicinity of $\beta$.

The approximation described above takes the form

$$T_{p,s}(k_x) \approx \frac{N_{p,s}(k_\ell)}{D'(k_\ell)}\,\frac{1}{k_x - k_\ell} \qquad (2)$$

where

$$N_p(k_x) = 2k_s^2(\kappa_s^2 - k_x^2), \qquad N_s(k_x) = 4k_s^2 \kappa_p k_x \qquad (3)$$

are the numerators of $T_p$ and $T_s$ with

$$\kappa_s(k_x) = \sqrt{k_s^2 - k_x^2}, \qquad \kappa_p(k_x) = \sqrt{k_p^2 - k_x^2} \qquad (4)$$

Also $D'(k_x)$ is the derivative of the denominator

$$D(k_x) = 4k_x^2 \kappa_s \kappa_p + (\kappa_s^2 - k_x^2)^2 + \frac{\rho_f}{\rho}\,\frac{k_s^4 \kappa_p}{\kappa_f} \qquad (5)$$

with

$$\kappa_f(k_x) = \sqrt{k^2 - k_x^2} \qquad (6)$$

and $\rho_f$, $\rho$ being the mass densities of the liquid and solid, respectively. Since $k_\ell$ is
close to $k_R$, the Rayleigh wavenumber, we may make the approximation of replacing
$k_\ell$ in $N_p(k_\ell)$, $N_s(k_\ell)$ and $D'(k_\ell)$ by $k_R$. Thus Eq. (2) becomes

$$T_{p,s}(k_x) \approx \frac{N_{p,s}(k_R)}{D'(k_R)}\,\frac{1}{k_x - k_\ell} \qquad (7)$$
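
The free-surface Rayleigh wavenumber $k_R$ used in the approximation above is the real
root, lying above $k_s$, of the first two terms of $D(k_x)$, that is, of the denominator
with the liquid-loading term dropped. A minimal sketch of locating it by bisection
follows. The steel wave speeds and the frequency are assumed values chosen only for
illustration, and the function names are hypothetical.

```python
import math

def rayleigh_function(kx, ks, kp):
    """D(kx) without the liquid term, written for real kx >= ks.

    In this range kappa_s and kappa_p are both imaginary, so
    kappa_s * kappa_p = -sqrt(kx^2 - ks^2) * sqrt(kx^2 - kp^2).
    """
    a = math.sqrt(kx**2 - ks**2)
    b = math.sqrt(kx**2 - kp**2)
    return (ks**2 - 2.0 * kx**2) ** 2 - 4.0 * kx**2 * a * b

def rayleigh_wavenumber(ks, kp, iterations=80):
    """Bisection for the root of rayleigh_function between ks and 1.5 ks."""
    lo, hi = ks, 1.5 * ks          # function is positive at ks, negative at 1.5 ks
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if rayleigh_function(mid, ks, kp) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Assumed values: 1 MHz, shear speed 3100 m/s, longitudinal speed 5790 m/s.
omega = 2.0 * math.pi * 1.0e6
ks, kp = omega / 3100.0, omega / 5790.0
kR = rayleigh_wavenumber(ks, kp)
print(f"k_R / k_s = {kR / ks:.4f}")
```

With the liquid present the root moves slightly off the real axis to $k_\ell$; finding that
complex root requires care with the branch of $\kappa_f$ and is not attempted here.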


A simple Laurent expansion cannot be used for the reflection coefficient $R$. For
a lossless solid $|R| = 1$ for $k_x > k_s$. Thus $R$ must have a zero $k_0$ located at the point
$k_0 = k_\ell^*$. The correct form found for $R$ is

$$R(k_x) = \frac{k_x - k_0}{k_x - k_\ell} \qquad (8)$$

When loss is present in the solid, $k_0 \neq k_\ell^*$. Loss may be introduced into the computation
of $k_0$ and $k_\ell$ by allowing the longitudinal and shear wavenumbers to have complex values.
It has been found that the loss tangent of the longitudinal wave is not significant in
determining $k_\ell$ and $k_0$. For simplicity we therefore take both loss tangents to be the
same, so that

$$k_{p,s} = k'_{p,s}\,(1 + i d/2\pi) \qquad (9)$$

where $d$ is referred to as the loss factor.
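
One way to read Eq. (9): with $\exp(-i\omega t)$ time dependence, a plane wave $\exp(ikx)$ with
$k = k'(1 + id/2\pi)$ is attenuated in amplitude by the factor $\exp(-d)$ over one wavelength
$2\pi/k'$, so that $d$ measures the loss in nepers per wavelength. The short check below
illustrates this. The value of $k'$ is an arbitrary assumption, the function names are
hypothetical, and the value of $d$ is the critical loss factor quoted later in this
section for the water-stainless steel interface.

```python
import cmath
import math

def complex_wavenumber(k_real, d):
    """Wavenumber with loss factor d, following k = k'(1 + i d / 2 pi)."""
    return k_real * (1.0 + 1j * d / (2.0 * math.pi))

def amplitude_after_one_wavelength(k_real, d):
    """|exp(i k x)| evaluated at x = one wavelength = 2 pi / k'."""
    k = complex_wavenumber(k_real, d)
    wavelength = 2.0 * math.pi / k_real
    return abs(cmath.exp(1j * k * wavelength))

k_real = 2000.0   # rad/m, arbitrary illustrative value
d = 0.0683        # critical loss factor quoted in the text
print(amplitude_after_one_wavelength(k_real, d), math.exp(-d))
```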
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The  variation  in  the  pole  and  zero  location  in  the  complex  plane  as  a function 
of $d = d_p = d_s$ is shown in Fig. 2 for a water-stainless steel interface. For $d = 0$ we
have that $k_0 = k_\ell^*$, while for $d \neq 0$ both $k_0$ and $k_\ell$ move upward as $d$ increases; the
numbered points on the curves refer to particular values of $d$. It is seen that at some
critical value $d_c$ of $d$, the zero crosses the real axis. Thus, a real-axis zero is
produced, which corresponds to total suppression of the geometrically reflected field.
Since the loss factor $d$ depends on frequency, the value $d = d_c$ will be obtained at a
particular frequency. The existence of a frequency at which the reflection is a minimum
has previously been observed, and is referred to as the frequency of least reflection.
Thus the approximation of Eq. (8) for the reflection coefficient explains in a simple
way the dependence of $R$ on loss for plane waves incident near the Rayleigh angle.
The plot of Fig. 2 has the same general features as those previously reported for an
aluminum-water interface [9], although the shape of the curves is somewhat different owing
to the differences in mass density.

Fig. 2. Loci of the pole $k_\ell$ and zero $k_0$ of $R(k_x)$ in the complex $k_x$ plane for a
water-stainless steel interface with loss factor $d$ as parameter.
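
The behavior described above can be illustrated with the pole-zero form
$R(k_x) = (k_x - k_0)/(k_x - k_\ell)$ of Eq. (8). In the sketch below the pole and zero
locations are hypothetical numbers, normalized to the liquid wavenumber $k$, and are not
values read from Fig. 2. The sketch only shows that $|R| = 1$ on the real axis when the
zero is the complex conjugate of the pole, and that $|R|$ vanishes at $k_x = k_0$ once the
zero reaches the real axis.

```python
def reflection_coefficient(kx, k_zero, k_pole):
    """Pole-zero approximation R = (kx - k0) / (kx - k_l)."""
    return (kx - k_zero) / (kx - k_pole)

# Hypothetical, normalized pole location beta + i*alpha (not taken from Fig. 2).
k_pole = 0.515 + 0.004j

# Lossless solid: zero at the complex conjugate of the pole, so |R| = 1 for real kx.
k_zero_lossless = k_pole.conjugate()
for kx in (0.50, 0.515, 0.53):
    print(kx, abs(reflection_coefficient(kx, k_zero_lossless, k_pole)))

# Critical loss: zero assumed to sit on the real axis, giving a reflection null there.
# (The pole also moves with loss; it is held fixed here for simplicity.)
k_zero_critical = 0.515 + 0.0j
print(abs(reflection_coefficient(0.515, k_zero_critical, k_pole)))
```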

As part of the study reported here, we have investigated the accuracy of the
approximation in Eq. (8) by comparing numerical results from Eq. (8) with the exact
expression for the reflection coefficient. The deviation $\Delta\phi$ of the phase of the
approximate expression in Eq. (8) from that of the exact expression for $R$ is plotted in
Fig. 3 as a function of $k_x$ normalized to the fluid wavenumber $k$ for a water-stainless
steel interface. Curves for the cases of no loss ($d = 0$), critical loss ($d = d_c = 0.0683$)
and high loss ($d = 0.5$) all show that $\Delta\phi$ is small for transverse wavenumbers $k_x$
somewhat greater than the shear wavenumber $k_s$ of the solid. Moreover, $\Delta\phi$ varies slowly
with $k_x$, except when $k_x$ is close to $k_p$ in the case of critical loss, so that the relative
error is even smaller. The value of $|R|$, as determined from Eq. (8), is also close to the
exact value, so that Eq. (8) represents a simple but accurate approximation for $R$.

Fig. 3. Phase deviation $\Delta\phi$ (equal to the exact minus the approximate phase of $R$) as
a function of $k_x/k$ in the vicinity of the Rayleigh wavenumber.

B.  Fields  Excited  in  the  Solid  by  a Bounded  Beam 

For the realistic case of a bounded beam incident on the surface, the incident field
is composed of a spectrum of plane waves distributed over a range of angles about the
angle of incidence $\theta_i$ of the beam axis. As a result of the variation of $R$, $T_p$ and $T_s$
with $k_x = k \sin\theta$, for $\theta_i$ near $\theta_R$, the profile of the reflected beam and the
variation of the substrate fields along $x$ will differ substantially from those of the
incident beam. In addition, the presence of loss can be expected to influence the beam
profile and $x$ variation as well as the amplitude.

To study the effect of loss in the solid, we consider an incident beam having a
Gaussian profile. The $x$ component of particle velocity is assumed at the interface $z = 0$
to have the form

$$v_x(x, 0) = \exp(-x^2/w_0^2)\,\exp(i k_i x) \qquad (10)$$

where $k_i = k \sin\theta_i$ and $w_0$ is the half-width of the beam as measured parallel to the
interface (see Fig. 1).
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The  Gaussian  beam  has  a plane  wave  spectrum  whose  amplitude  is  given  by 


$$V(k_x) = \sqrt{\pi}\, w_0 \exp[-(k_x - k_i)^2 w_0^2/4] \qquad (11)$$
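
As a check on Eqs. (10) and (11), the spectrum of the Gaussian aperture field can be
evaluated numerically. The sketch assumes the transform convention
$V(k_x) = \int v_x(x,0)\,e^{-ik_x x}\,dx$, which is not stated explicitly in the text, and
uses arbitrary values of $w_0$ and $k_i$; the function names are hypothetical.

```python
import cmath
import math

def aperture_field(x, w0, ki):
    """Gaussian beam profile at z = 0: exp(-x^2/w0^2) exp(i ki x)."""
    return cmath.exp(-(x / w0) ** 2 + 1j * ki * x)

def spectrum_closed_form(kx, w0, ki):
    """sqrt(pi) w0 exp[-(kx - ki)^2 w0^2 / 4]."""
    return math.sqrt(math.pi) * w0 * math.exp(-((kx - ki) * w0 / 2.0) ** 2)

def spectrum_numeric(kx, w0, ki, half_range=8.0, steps=4000):
    """Trapezoidal estimate of the integral of v_x(x,0) exp(-i kx x) over x."""
    a, b = -half_range * w0, half_range * w0
    h = (b - a) / steps
    total = 0.0 + 0.0j
    for m in range(steps + 1):
        x = a + m * h
        weight = 0.5 if m in (0, steps) else 1.0
        total += weight * aperture_field(x, w0, ki) * cmath.exp(-1j * kx * x)
    return total * h

w0, ki = 2.0, 3.0   # arbitrary illustrative values
for kx in (2.0, 3.0, 3.5):
    print(kx, spectrum_numeric(kx, w0, ki).real, spectrum_closed_form(kx, w0, ki))
```

The spectrum narrows as $w_0$ grows, which is the sense in which a wide beam is
concentrated about $k_x = k_i$.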


Using the plane wave representation, the fields in the solid can be found as a
superposition of plane waves whose amplitudes are determined by Eq. (10) and the plane
wave transmission coefficients. For example, the $x$ component of particle velocity in the
solid is given by

$$v_x(x, z) = \frac{\rho_f}{\rho} \int_{-\infty}^{\infty} V(k_x) \left[ T_p\, e^{i\kappa_p z} - \frac{\kappa_s}{k_x}\, T_s\, e^{i\kappa_s z} \right] e^{i k_x x}\, dk_x \qquad (12)$$

Similar expressions hold for the other field components.


For wide beams, the spectrum of plane waves in Eq. (12) is concentrated about
$k_x = k_i$. For $k_i$ close to $\beta$, and hence to $k_R$, use of the approximations in Eq. (7) for
$T_p$ and $T_s$ is justified. The plane wave fields in the solid for $k_x$ near $k_R$, i.e., for
$k_x > k_s$, will have $\kappa_p$ and $\kappa_s$ imaginary, as seen from Eq. (4). Thus $v_x(x,z)$ will
decay exponentially into the solid, so that we focus attention on observation points near
the surface. For these observation points the terms $\exp(i\kappa_p z)$ and $\exp(i\kappa_s z)$ will be
slowly varying functions of $k_x$ over the significant part of the plane wave spectrum, and
may therefore be treated as constants in the integration. Similarly, the factor $\kappa_s/k_x$
may be treated as a constant.

Evaluating the terms taken as constants in the integration at $k_x = k_R$ we obtain for
Eq. (12)

$$v_x(x, z) = \frac{\rho_f}{\rho}\, E(x) \left[ \frac{N_p(k_R)}{D'(k_R)}\, e^{-|\kappa_p(k_R)| z} - \frac{\kappa_s(k_R)}{k_R}\, \frac{N_s(k_R)}{D'(k_R)}\, e^{-|\kappa_s(k_R)| z} \right] \qquad (13)$$

where

$$E(x) = \sqrt{\pi}\, w_0 \int_{-\infty}^{\infty} \frac{\exp[-(k_x - k_i)^2 w_0^2/4]}{k_x - k_\ell}\, e^{i k_x x}\, dk_x \qquad (14)$$

The integral in Eq. (14) may be evaluated in closed form, and is found to be

$$E(x) = i\pi^{3/2} w_0\, e^{\gamma^2}\, \mathrm{erfc}(\gamma)\, v_x(x, 0) \qquad (15)$$

where $\mathrm{erfc}(\gamma)$ is the complementary error function and

$$\gamma = -\frac{x}{w_0} + i(k_i - k_\ell)\, w_0/2 \qquad (16)$$




The  polarization  and  z dependence  of  the  velocity  in  Eq.  (13)  is  that  of  a Rayleigh 
wave.  The  variation  of  the  fields  with  x results  from  the  excitation  and  re-radiation 
into  the  fluid  of  the  Rayleigh  wave.  Similarly,  the  other  field  components  can  be  shown 
to  be  those  of  a Rayleigh  wave  whose  amplitude  varies  as  E(x). 

The maximum magnitude achieved by $|E(x)|$ occurs when $k_i = \beta$, so that $\gamma$ in Eq.
(16) is real. For $|k_i - \beta|\, w_0 > \pi/2$, the maximum value of $|E(x)|$ decreases rapidly
with increasing phase deviation $|k_i - \beta|$. Thus, strong field excitation occurs in the
substrate only when the incident beam is close to phase matching with the Rayleigh wave.
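
The phase-matching behavior can be explored numerically. The sketch below assumes that
$E(x)$ has the form $\sqrt{\pi}\,w_0 \int \exp[-(k_x-k_i)^2 w_0^2/4]\,e^{ik_x x}/(k_x - k_\ell)\,dk_x$
and compares direct numerical integration with the closed form
$i\pi^{3/2} w_0\, e^{\gamma^2}\,\mathrm{erfc}(\gamma)\, v_x(x,0)$, with
$\gamma = -x/w_0 + i(k_i - k_\ell)w_0/2$. All parameter values are dimensionless and
hypothetical, the function names are our own, and the complex error function is
evaluated by a simple quadrature rather than by a library routine.

```python
import cmath
import math

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule for a complex-valued integrand."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for m in range(1, n):
        total += f(a + m * h)
    return total * h

def E_numeric(x, ki, kl, w0):
    """Direct quadrature of the spectral integral assumed for E(x)."""
    def integrand(q):                      # q = kx - ki
        kx = ki + q
        return cmath.exp(-(q * w0 / 2.0) ** 2 + 1j * kx * x) / (kx - kl)
    span = 12.0 / w0                       # Gaussian factor is negligible beyond this
    return math.sqrt(math.pi) * w0 * trapezoid(integrand, -span, span, 6000)

def erfc_complex(g):
    """erfc(g) = (2/sqrt(pi)) * integral_0^inf exp(-(g+u)^2) du, by quadrature."""
    upper = 8.0 + max(0.0, -g.real)
    integral = trapezoid(lambda u: cmath.exp(-(g + u) ** 2), 0.0, upper, 8000)
    return 2.0 / math.sqrt(math.pi) * integral

def E_closed(x, ki, kl, w0):
    """Closed form i pi^(3/2) w0 exp(gamma^2) erfc(gamma) v_x(x,0)."""
    g = -x / w0 + 1j * (ki - kl) * w0 / 2.0
    v0 = cmath.exp(-(x / w0) ** 2 + 1j * ki * x)
    return 1j * math.pi ** 1.5 * w0 * cmath.exp(g * g) * erfc_complex(g) * v0

# Hypothetical dimensionless parameters: w0 = 1, leaky pole at beta + i*alpha.
w0, beta, alpha = 1.0, 10.0, 0.68
kl = beta + 1j * alpha
xs = [0.25 * m for m in range(-8, 17)]     # x/w0 from -2 to 4

peak_matched = max(abs(E_numeric(x, beta, kl, w0)) for x in xs)
peak_detuned = max(abs(E_numeric(x, beta - 6.0, kl, w0)) for x in xs)
print(peak_matched, peak_detuned)
```

In this sketch the detuned case has $|k_i - \beta|\,w_0 = 6$, well beyond $\pi/2$, and its
peak is several times smaller than in the phase-matched case.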

C.  Power  Flow  in  the  Solid 


Using Eq. (14), we have computed the total power $P_x(x)$ carried along $x$, per meter
along $y$, in the Rayleigh wave fields by integrating the local Poynting vector over $z$
from $z = 0$ to $z = \infty$. Thus

$$P_x(x) = \int_0^{\infty} \tfrac{1}{2}\,\mathrm{Re}\left[-\mathbf{v}(x,z)\cdot\mathbf{T}^*(x,z)\cdot\mathbf{x}_0\right] dz \qquad (17)$$

where $\mathbf{T}(x,z)$ is the stress tensor and $*$ represents the complex conjugate. It is found
that $P_x(x)$ is proportional to $|E(x)|^2$.

The power $P_x(x)$ is plotted in Fig. 4 as a function of $x/w_0$ for a water-stainless
steel interface with a beam half-width $w_0 = 0.68/\alpha_0$ and several values of the loss
parameter $d$. Since the Rayleigh wave excitation is strongest when $k_i = \beta$, the computation
made in arriving at Fig. 4 assumes the angle of incidence is such as to satisfy this
relationship for $d = 0$. In Fig. 4, $P_x(x)$ has been normalized to the total power in the
incident beam. The quantity $\alpha_0$ is the attenuation constant of the leaky Rayleigh wave
for the lossless case $d = 0$, and the value of $w_0$ is chosen such that $P_x(x)$ takes on its
highest possible value of 0.8.


To the left of the maximum of $P_x(x)$, which occurs at some value $x = x_m$, the power
in the Rayleigh wave increases as the wave is excited. Subsequently its amplitude
decreases due to radiation into the fluid, and due to acoustic damping for $d \neq 0$. The
maximum power in the Rayleigh wave is seen to decrease as loss increases. This effect
is more clearly shown in Fig. 5, where the value of $P_x(x_m)$, normalized to the power
in the incident beam, is plotted as a function of $d$ for several values of normalized beam
width $\alpha_0 w_0$. The power decreases monotonically with $d$, and for $d = 0$ has a
maximum value when the width of the incident beam is such that $\alpha_0 w_0 = 0.68$.

From Fig. 4 it is seen that for each curve, the maximum slope $|dP_x/dx|$ to the
left of the peak decreases as the loss factor $d$ increases. However, for $d < d_c$, the
maximum of $|dP_x/dx|$ to the right of the peak is almost independent of $d$. These


Fig. 4. Variation of the power carried by the Rayleigh wave with
distance along the interface and loss factor $d$ as a parameter.

Fig. 5. Variation of the peak power $P_x(x_m)$ carried by the Rayleigh wave with
loss factor $d$, with normalized half-width $\alpha_0 w_0$ as a parameter.


observations suggest that for $d < d_c$, the presence of loss interferes with tl
of the leaky wave, but does not significantly increase the attenuation consta
leaky wave. This effect is confirmed by the small movement of the pole $k_\ell$
for $d < d_c$. For $d > d_c$, the maximum value of $|dP_x/dx|$ decreases with d c
of the peak, indicating that loss has a significant effect on $\alpha$, as well as on
tion of the leaky wave. The foregoing discussion is consistent with the inte

g 

of  the  dependence  of  the  reflected  beam  profile  on  loss. 
RAYLEIGH WAVE SCATTERING FROM A LIQUID-FILLED CRACK: PART I,
VARIATIONAL EXPRESSIONS FOR REFLECTION AND TRANSMISSION COEFFICIENTS

Y.  L.  Hou  and  H.  L.  Bertoni 


In the work reported here, a variational expression was developed giving the
reflection and transmission coefficients of a Rayleigh wave at a liquid-filled crack. The
expression gives the amplitude and phase of these coefficients, as well as the power
scattered into the bulk waves. Rayleigh wave scattering from cracks is a fundamental
problem in elasticity, and is of interest for seismology as well as for microwave
acoustics and nondestructive evaluation (NDE), which are areas of active research at
the Polytechnic.

Unwanted  cracks  on  the  surface  of  a crystal  will  scatter  the  Rayleigh  wave  in  a 
SAW  device  and  degrade  its  performance.  A knowledge  of  the  dependence  of  scatter  on 
crack  size  would  permit  formulation  of  criteria  for  maximum  defect  size.  Alternative- 
ly, machined  cracks,  or  narrow  grooves,  could  be  used  as  reflecting  elements  in  RAC- 
type  devices.  Here  it  is  essential  to  know  both  the  reflection  coefficient  and  the  power 
scattered  into  bulk  waves  in  order  to  design  and  evaluate  devices. 

An  important  problem  in  NDE  is  the  location  of  cracks  initiating  at  the  surface 
of  a part.  Since  the  initial  size  of  the  crack  determines  the  working  life  of  a part,  it 
is  important  to  find  cracks  above  a certain  size.  One  type  of  device  proposed  for 
locating  cracks  employs  an  elastic  beam  in  a prism  coupled  to  the  solid  under  test  by 
a liquid  layer  or  an  acoustic  beam  incident  from  a liquid  bath.  When  incident  at  the 
angle  for  excitation  of  the  Rayleigh  wave,  the  reflected  beam  contains  features  that  are 
strongly  perturbed  when  the  Rayleigh  wave  is  scattered  by  a crack.  Monitoring  the 
features  of  the  reflected  beam  can  then  be  used  to  locate  the  crack. 

In this initial study the crack is assumed to have slip boundary conditions, wherein
the normal components of stress and particle motion are continuous, but the tangential
component of stress vanishes. These boundary conditions were chosen because of
interest in the NDE problem described above, in which a liquid is present at the crack. In
the case of a thin, unstressed fatigue crack, the slip boundary conditions may be more
applicable than the free-surface boundary conditions. In any case, the slip boundary
condition leads to the simplest variational expression.

The crack is assumed to be oriented perpendicular to the direction of propagation
of the Rayleigh wave. Both cracks initiating at the surface and cracks located wholly
within the solid are considered. While the depth into the solid is finite, the length of the
crack parallel to the surface is taken infinite, so that the scattering problem is
two-dimensional. While considerable literature exists on the scattering of bulk waves by
cracks (see survey in Ref. 2), Rayleigh wave scattering by a crack at or near the
surface of a half-space does not seem to have been treated previously.

Starting  with  an  integral  representation  for  the  fields  inside  a solid  in  terms  of 
the  fields  at  its  surface,  an  integral  equation  is  obtained  for  the  jump  in  the  tangential 
component  of  particle  motion  across  the  crack.  The  variational  expression  is  then  ob- 
tained as  an  approximate  solution  to  the  integral  equation.  The  trial  function  in  the 
variational  expression  is  the  jump  in  the  tangential  component  of  motion. 

In  Part  II  of  this  two-part  report,  choice  of  the  trial  functions  is  discussed  and 
numerical  results  are  obtained  for  the  reflection  and  transmission  coefficients.  To 
obtain  convergence  of  the  integrals,  it  is  found  necessary  to  include  the  correct  edge 
condition  at  the  crack  tip.  Trial  functions  are  therefore  taken  as  the  product  of  the 
field  dependence  of  the  Rayleigh  wave,  and  a function  accounting  for  the  edge  condition. 

A.  Integral  Equation  for  Scattering  by  a Crack 

The geometry of the slip crack is shown in Fig. 1(a) for the case of a crack
originating at the surface, and in Fig. 1(b) for a crack inside the solid. A Rayleigh wave
is incident from $x = -\infty$, and both it and the crack have infinite extent along $y$. For
generality we consider the crack inside the solid, and recover the crack at the surface
by setting $d = 0$.
Fig. 1. Crack geometry for: (a) crack originating
at surface; and (b) crack entirely within
solid, indicating surfaces of integration.


Let $G_{ik}(x, z; x', z')$ represent the two-dimensional half-space Green's function,
i.e., the $i = 1,3$ component of particle displacement due to a point source oriented along
the $k = 1,3$ axis and having harmonic time dependence $\exp(j\omega t)$. The corresponding
stress field is denoted by $T^{(k)}_{ij}(x, z; x', z')$. Further, let $U_i(x,z)$ be the $i = 1,3$
component of particle displacement of the field scattered by the crack, i.e., the total
field less the incident Rayleigh wave field. The stress field of the scattered wave is
$T_{ij}(x,z)$.

The Green's function fields and the scattered fields satisfy the dynamic equations.
Thus, for any closed cylindrical surface parallel to the $y$ axis and having a generating
curve $S(x,z)$, it can be shown (Ref. 3) that

$$U_k(x, z) = \oint_S \left[ G_{ik}(x', z'; x, z)\, T_{ij}(x', z') - U_i(x', z')\, T^{(k)}_{ij}(x', z'; x, z) \right] n_j\, ds(x', z') \qquad (1)$$

Here integration is taken over the closed curve $S(x', z')$ and $n_j$ represents the $j = 1,3$
component of the outward unit normal. Summation over repeated indices is implied.


The curve $S$ is taken as the free surface in Fig. 1(b), the semi-cylinder $C$ at
infinity and the two sides of the crack. On the free surface, the stresses $T_{ij}$ and
$T^{(k)}_{ij}$ vanish, while on $C$ all field quantities vanish, if small loss is assumed to
insure convergence. At the slip crack the 13 component of the total stress must vanish.
Therefore $T_{13}(0,z)$ is the negative of the 13 stress component of the incident Rayleigh
wave, which is continuous across the crack. Thus $T_{13}(x,z)$ is continuous across the
crack, as are $T_{11}(x,z)$ and $U_1(x,z)$. Similarly $G_{ik}$ and $T^{(k)}_{ij}$ are continuous across
the crack, since they apply to a semi-infinite solid without a crack. Only $U_3(x, z)$ is
discontinuous.

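Using the foregoing properties of the fields at the free surface, at the sides of the
crack, and on $C$, the representation theorem in Eq. (1) reduces to

$$U_k(x, z) = \int_d^{d+h} T^{(k)}_{13}(0, z'; x, z)\, J_3(z')\, dz' \qquad (2)$$

where $J_3(z')$ is the jump in $U_3(x, z)$ across the crack, and is given by

$$J_3(z') = U_3(0^+, z') - U_3(0^-, z') \qquad (3)$$

From the dynamic equation for isotropic solids,

$$T_{13}(x, z) = \mu \left( \frac{\partial U_1}{\partial z} + \frac{\partial U_3}{\partial x} \right) \qquad (4)$$

where $\mu$ is the shear modulus. Substituting Eq. (2) into Eq. (4) gives

$$T_{13}(x, z) = \int_d^{d+h} A(z'; x, z)\, J_3(z')\, dz' \qquad (5)$$

where

$$A(z'; x, z) = \mu \frac{\partial}{\partial z} T^{(1)}_{13}(0, z'; x, z) + \mu \frac{\partial}{\partial x} T^{(3)}_{13}(0, z'; x, z) \qquad (6)$$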
As discussed above, at the crack $x = 0$, $d < z < d+h$, the stress component $T_{13}(0, z)$
is the negative of $T^{i}_{13}(0,z)$ for the incident Rayleigh wave, which is given by


T^3(0,z)=  -2p4,k  [e 


Here $k_R$ is the Rayleigh wavenumber and $\kappa_{pR}$ and $\kappa_{sR}$ are the values of the
wavenumbers

$$\kappa_p(k_x) = \sqrt{k_p^2 - k_x^2}, \qquad \kappa_s(k_x) = \sqrt{k_s^2 - k_x^2} \qquad (8)$$

evaluated at $k_x = k_R$, with $k_p$ and $k_s$ the longitudinal and shear wavenumbers of the solid.

Evaluating Eq. (5) at $x = 0$, $d < z < d+h$ we therefore find

$$-T^{i}_{13}(0, z) = \int_d^{d+h} A(z'; 0, z)\, J_3(z')\, dz' \qquad (9)$$

In Eq. (9), $T^{i}_{13}$ and $A(z'; 0, z)$ are known, while the jump $J_3(z')$ in the particle
displacement $U_3(z')$ is unknown. Equation (9) thus represents an integral equation for the
unknown jump. This equation is used below to obtain a variational expression for the
Rayleigh wave reflection and transmission coefficients.


B. Fourier Representation for A and the Relation Between R and T

From a consideration of the spectral representation of $A(z'; x, z)$ it is possible
to find a relationship between the Rayleigh wave reflection and transmission coefficients
$R$ and $T$. The spectral representation is then required to find a variational expression
that gives $R$ and $T$.

The stress components $T^{(1)}_{13}$ and $T^{(3)}_{13}$ in Eq. (6) can be found from Eq. (4) with
$U_i$ replaced by appropriate components of the Green's function. Thus $A(z'; x, z)$
consists of combinations of derivatives with respect to $x$, $x'$, $z$ and $z'$ of the Green's
function $G_{ik}(x', z'; x, z)$. The Green's function, and hence $A(z'; x, z)$, can be separated
into particular and homogeneous parts $A^p(z'; x, z)$ and $A^h(z'; x, z)$. The particular portion
corresponds to radiation by the point source into an infinite medium. The homogeneous
portion corresponds to the fields generated by reflection of the point source radiation
at the free surface $z = 0$.

4 

The  homogeneous  portion  of  the  Green's  function  leads  to 
Fig. 2. Integration path along the Re $k_x$ axis
in the complex $k_x$ plane, indicating
branch points, branch cuts and poles
of $A^h(z'; x, z)$.

For $x < 0$, which is to the left of the crack, the integration path in Fig. 2 can be
deformed into the upper half plane leaving a residue contribution and the contribution
coming from integration around the branch cuts originating at $-k_p$ and $-k_s$.
Substituting the particular, branch cut and residue portions of $A$ into Eq. (5) it can be
shown that for $x < 0$,


d+h  p hr  1 

Tj2(x,z)=  j j^Ap(z';  X,  z)  + Aj^  (z';  X,  z)J  J^(z')  dz' 


2 F^(k^  J 


(12) 


4 p w 


where $F'(k_R)$ is the derivative of $F(k_x)$ evaluated at $k_x = k_R$ and

$$I_p = \int_d^{d+h} e^{-j\kappa_{pR} z'}\, J_3(z')\, dz' \qquad (13)$$

and $I_s$ is similarly defined with $\kappa_{sR}$ replacing $\kappa_{pR}$. In deriving Eq. (12), the equalities
e(kj^)  = were  used. 

The left-hand side of Eq. (12) represents the scattered field and contains two
portions, one portion being the bulk wave and the other the reflected Rayleigh wave
$-R\,T^{i}_{13}(0,z)\,e^{jk_R x}$. Equating this reflected Rayleigh wave field with the portion on the
right-hand side of Eq. (12) having the Rayleigh dependence gives


.K  = . ^ 

4 p 0)^  F'(l^) 


(14) 


For $x > 0$, that is to the right of the crack, the integration path in Fig. 2 can be
deformed into the lower half plane. As a result, $A^h$ will be the sum of a residue
contribution from the pole at $k_R$ and the integration around the branch cuts originating at
$k_p$ and $k_s$. When substituted into Eq. (5), one obtains the same expression as given in
Eq. (12) with $k_R$ replaced by $-k_R$. The left-hand side of the new equation will consist
of the bulk wave contribution to the scattered field plus the Rayleigh wave term
$(T - 1)\,T^{i}_{13}(0,z)\,e^{-jk_R x}$. Here $T$ is the transmission coefficient, and when the foregoing
Rayleigh wave term is added to the incident Rayleigh wave one obtains the transmitted
Rayleigh wave. From arguments similar to those used to obtain Eq. (14), one finds


4 p *^pR 


(15) 


where the symmetries $\epsilon(-k_R) = \epsilon(k_R)$ and $F'(-k_R) = -F'(k_R)$ have been used. Comparing
Eqs. (14) and (15) one finds the relation

$$R = 1 - T \qquad (16)$$

between $R$ and $T$.
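
Equation (16) ties the two Rayleigh wave coefficients together, so that a single complex
number $R$ fixes both. If one further assumes a lossless solid, so that the incident
Rayleigh wave power is divided among the reflected wave, the transmitted wave and the
bulk waves, the fraction scattered into bulk waves follows as $1 - |R|^2 - |T|^2$. The
sketch below applies this with a purely hypothetical value of $R$. The power balance is
an assumption added here for illustration and is not a formula quoted from this report.

```python
def transmission_from_reflection(R):
    """Eq. (16): R = 1 - T, hence T = 1 - R."""
    return 1.0 - R

def bulk_power_fraction(R):
    """Fraction of incident Rayleigh power not carried by reflected or transmitted waves.

    Assumes a lossless solid (power balance), which is an added assumption.
    """
    T = transmission_from_reflection(R)
    return 1.0 - abs(R) ** 2 - abs(T) ** 2

R = 0.3 + 0.2j            # hypothetical reflection coefficient
T = transmission_from_reflection(R)
print(T, bulk_power_fraction(R))
```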
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C.  Variational  Expression  for  R and  T 

To obtain the desired variational expression, we specialize Eq. (12) to the case
$x = 0$, which is equivalent to using the Fourier representation for $A$ in Eq. (9).
Since $T_{13}(0,z)$ is equal to $-T^{i}_{13}(0,z)$, and using Eq. (14), we obtain from Eq. (12)

$$-T^{i}_{13}(0, z) = \int_d^{d+h} \left[ A^p(z'; 0, z) + A^{bc}(z'; 0, z) \right] J_3(z')\, dz' - R\, T^{i}_{13}(0, z) \qquad (17)$$


We now multiply Eq. (17) by $J_3(z)$ and integrate from $d$ to $d+h$. With the help of Eq. (7),
the foregoing integration yields

$$(R - 1)(-2 k_R \kappa_{pR})(I_p - I_s) = \int_d^{d+h}\!\!\int_d^{d+h} \left[ A^p(z'; 0, z) + A^{bc}(z'; 0, z) \right] J_3(z')\, J_3(z)\, dz'\, dz \qquad (18)$$


To complete the derivation, we divide Eq. (18) by $(I_p - I_s)^2$ and use Eq. (14) to
eliminate $1/(I_p - I_s)$ from the left-hand side of the resulting equation. By this process
we obtain


I - R 
R 


2 p u)^F'(k^) 


- A^'"(z^  0,z)j  J3(z')J3(z)dz'dz 


(19) 


Equation (19) is the variational expression from which we obtain R. It is variational in
the sense that the variation of the right-hand side with respect to J_3(z) vanishes when
the right-hand side is evaluated at the exact solution for J_3(z). Proof of this statement
is straightforward, but is not given here.

The importance of Eq. (19) is that, given a reasonable guess for J_3(z), we obtain a
more accurate value for R. Note that c(k_R) and F'(k_R) are known functions, so that R
can be determined once the right-hand side of Eq. (19) has been found numerically.
Since J_3 appears twice in the numerator and twice in the denominator, multiplication of
J_3(z) by a constant will not change the value of R. Evaluation of the integrals in Eq.
(19) and the appropriate choice of J_3(z) are discussed in Part II of this two-part report.
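
The two properties just stated, stationarity and insensitivity to the scale of the trial function, can be illustrated on a small discrete analogue. The sketch below is not Eq. (19) itself: the 2x2 symmetric matrix K, the vector g and the ratio Q(J) = (J.K.J)/(g.J)^2 are hypothetical stand-ins for the kernel double integral and the single integrals, chosen only because Q contains J twice in the numerator and twice in the denominator.

```python
# Discrete stand-in for a variational ratio: Q(J) = (J.K.J) / (g.J)^2.
# K and g are made-up numbers, not the elastic kernel of Eq. (19).

K = [[2.0, 1.0],
     [1.0, 3.0]]          # symmetric "kernel"
g = [1.0, 2.0]            # weights playing the role of the single integrals


def quadratic_form(J):
    """Return J.K.J for the 2x2 matrix K."""
    return sum(J[i] * K[i][j] * J[j] for i in range(2) for j in range(2))


def ratio(J):
    """Variational ratio Q(J); J appears twice upstairs and twice downstairs."""
    gJ = sum(gi * Ji for gi, Ji in zip(g, J))
    return quadratic_form(J) / gJ ** 2


# Exact solution of K J = g (worked out by hand for this 2x2 case).
J_exact = [0.2, 0.6]
Q_exact = ratio(J_exact)

# Scaling the trial function leaves Q unchanged.
Q_scaled = ratio([5.0 * x for x in J_exact])

# A first-order error eps in J produces only a second-order error in Q.
direction = [1.0, -1.0]


def error(eps):
    J_trial = [x + eps * d for x, d in zip(J_exact, direction)]
    return abs(ratio(J_trial) - Q_exact)


err_coarse = error(1e-2)
err_fine = error(1e-3)
print(Q_exact, Q_scaled, err_coarse / err_fine)
```

Reducing the trial-function error by a factor of 10 reduces the error in Q by a factor of about 100, which is the practical meaning of "a reasonable guess gives a more accurate value".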

National  Science  Foundation 

ENG75-00569  Y.L.  Hou  and  H.  L.  Bertoni 


REFERENCES 

1. Y.L. Hou and H.L. Bertoni, "Evaluation of Substrate Properties Using an Acoustic
Beam in a Prism Coupled via a Fluid Layer," Progress Report No. 40 to JSTAC,
Polytech. Inst. of New York, Report No. R-452.40-75, 82-92 (1975).

2. E.A. Kraut, "Review of Theories of Scattering of Elastic Waves by Cracks," IEEE
Trans., SU-23, 162-167 (1976).

3. L. Knopoff, "Diffraction of Elastic Waves," JASA, 217-229 (1956).

4. R.W. Fredericks, "Scattering of Elastic Pulses by Obstacles of Infinite Impedance
and Semi-Infinite Dimensions on the Surface of a Half-Space," Ph.D. Thesis, U. of
Calif., Los Angeles (1959).




RAYLEIGH  WAVE  SCATTERING  FROM  A LIQUID-FILLED  CRACK:  PART  II, 

EDGE  CONDITIONS  AND  NUMERICAL  RESULTS 

Y.L.  Hou  and  H.  L.  Bertoni 

In  Part  I of  this  two-part  report,  we  have  discussed  the  motivation  for  studying 
the  reflection  of  Rayleigh  waves  from  cracks,  which  is  a fundamental  problem  of  elas- 
ticity with  application  to  SAW  devices,  NDE  and  seismology.  The  study  carried  out 
here  applies  to  scattering  from  infinitely  long  cracks  that  are  perpendicular  to  the 
direction  of  propagation  and  have  slip  boundary  conditions.  The  slip  boundary  condition 
applies  to  a liquid  filled  crack,  which  is  of  particular  interest  in  NDE.  The  analysis 
in Part I leads to a variational expression from which the reflection coefficient R and
transmission coefficient T can be found. Further, the fraction of the incident power
scattered into bulk waves is found as 1 - |R|^2 - |T|^2.

The variational expression contains a double integral of a product consisting of
a kernel, known in terms of its Fourier representation, and a trial function, which is
the jump J_3(z) across the crack of the particle displacement component parallel to the
crack. In this report, it is shown that the double integral can be reduced to a single
integral over wavenumbers, provided a suitable choice of J_3(z) is made. The choice of
J_3(z) must take into account the field behavior near the edge of the crack required by
the physical principle of finite energy storage.

With the correct choice of J_3(z), the resulting wavenumber integral can be shown
to converge. Results obtained by numerical integration of the wavenumber
integral are presented. The data are for a crack originating at the surface and cover
the range from shallow to deep cracks. They show only weak dependence on Poisson's
ratio for the solid and a surprisingly large amount of power scattered into bulk waves,
even for deep cracks.

A. Simplification of Integrals for Numerical Evaluation

In  Part  I,  the  variational  expression  for  R was  obtained  in  the  form 

l-R  N hi 

where 

d+h-  K.-  1 

N=  // [a  (z';0,z)  - A^'^(z';0,z)J  J3(z')J3{z)dz  dz  (2) 

d 

and the other terms in Eq. (1) are defined in Eqs. (11) and (13) of Part I. The integrals
I_p and I_s are single integrals over finite limits, and can easily be evaluated. However,
the second kernel in Eq. (2) is given by the wavenumber integral in Eq. (10) of Part I
taken over the contour P in the complex k_x plane shown in Figure 1.

The function A(z';0,z) comes from the Green's function for an infinite medium,
and can therefore be represented in terms of various derivatives of the Hankel functions.
Alternatively, it may be expressed as

A(z';0,z) = \int_{-\infty}^{\infty} a(\kappa)\, e^{-j\kappa(z-z')}\, d\kappa    (3)


where 


L [4k  k + (k^  - 2k^)^] 

k L xp  xs  ' s J 

xs  ^ 


k_{xp} = \sqrt{k_p^2 - \kappa^2}

k_{xs} = \sqrt{k_s^2 - \kappa^2}    (5)

Using  Eq.  (3)  and  the  appropriate  expression  for  A^*^,  Eq . (2)  can  be  written  as 

d+h  00  • / . j 

N = fj  j h^(z}J^{z  }dKdzdz 


-jK  (z+z')  -jK  (z+z') 


QT II  , ( ^ I ^ / 

+ Jl  J 


d(k^)  [* 


-j(K„z  + K^z')  + 1 


P s 


FirrrJ3<z)J3^^')dkxdzdz' 

X 

(6) 


Interchanging  orders  of  integration  in  Eq.  (6)  gives 


= /“a(K)l(K)I(-K)dK  + /[c(kjl^/<p)+  ^d(kJl{K^)l(K^)  + 


where 


Note that in Eq. (1), I_p = I(κ_p) and I_s = I(κ_s). Thus, if the form of J_3(z) is sufficiently
simple so that the integral in Eq. (8) can be carried out analytically, then only a single
numerical integration in Eq. (7) over wavenumbers must be carried out in order to
determine the right-hand side of Equation (1).

Fig. 1. Branch points, branch cuts and the deformed integration path in the
complex k_x plane.


The path P may be divided into four parts. Let P_1 be the part running from -k_p
to k_x = 0 and then along the Im k_x axis to infinity. In the limit of zero loss, the branch
cuts in Fig. 1, defined by Im κ_p = 0 and Im κ_s = 0, coincide with portions of the Re k_x
axis and with the Im k_x > 0 axis. In this limit, the portion P_4 of the path lying on the
opposite side of the branch cuts to P_1 will also coincide with portions of the Re k_x axis
and the Im k_x > 0 axis. We take P_2 as the segment -k_s < k_x < -k_p and P_3 as the portion
of P opposite to P_2 across the cut.

The functions c(k_x), d(k_x), e(k_x) and F(k_x) are explicitly functions of
κ_p and κ_s, as seen from Eq. (11) of Part I. These functions have the symmetry properties

c(k_x; -\kappa_p, -\kappa_s) = c(k_x; \kappa_p, \kappa_s)

d(k_x; \kappa_p, -\kappa_s) = -d(k_x; \kappa_p, \kappa_s)

e(k_x; -\kappa_p, -\kappa_s) = e(k_x; \kappa_p, \kappa_s)

F(k_x; -\kappa_p, -\kappa_s) = F(k_x; \kappa_p, \kappa_s)    (9)

where the dependence on κ_p and κ_s has been displayed explicitly. Using these symmetry
properties and the parts of P described above, Eq. (7) can be written as

N = \int_{-\infty}^{\infty} a(\kappa) I(\kappa) I(-\kappa)\, d\kappa + \int_{P_1} n_1(k_x)\, dk_x + \int_{-k_s}^{-k_p} n_2(k_x)\, dk_x    (10)


where 


= FTk  vie  Vk~) 

X p s ^ ^ ^ 

+ 2d(k^;K  K^)[l{K  )I(K^)  + I(-K  )I(-K  )]+  e(k^;K  ,K^)[l^K^)  + I^-K^)]} 


(11) 


and 


n,(k  ;k  ,K  ) - ?/i; --t ITT  |c(k  ;k  ,K  )I^(k  ) + 2 d(k  ;k  ,K  )I(k  )I(k  ) 

2 X p s F(k  fK  tK  } ^ xps  p xps  s p 


X p s 


+ e(k^;/<  K^)I  (K^)] 


(12) 


Along the path P_1, the wavenumber κ_p varies from 0 to ∞ and is real. Letting

\kappa_p = \kappa    (13)

we may convert the integration over P_1 into an integral over 0 < κ < ∞ by change of
variable. Noting from Eq. (4) that a(κ) is an even function, Eq. (10) becomes

N = \int_0^{\infty} \left\{ 2a(\kappa) I(\kappa) I(-\kappa) + n_1[k_x(\kappa)] \frac{dk_x}{d\kappa} \right\} d\kappa + \int_{-k_s}^{-k_p} n_2(k_x)\, dk_x    (14)


The  second  integral  in  Eq.  (14)  is  over  finite  limits  and  the  integrand  is  well  behaved 
so  that  the  integration  can  be  carried  out  numerically.  The  convergence  of  the  integral 
over  infinite  limits  in  Eq.  (14)  is  discussed  in  the  next  section. 

B.  Edge  Condition  and  the  Choice  of  J3(z) 


For the remainder of this report we specialize our discussion to the case of a
crack originating at the surface, so that d = 0, and penetrating into the solid to a depth
h. At the surface z = 0, the two sides of the crack are free to shear or slide past each
other so that J_3(0) is finite. However, at the tip of the crack (x = 0, z = h) the two sides
of the crack are constrained by the surrounding medium so that no shear is possible
and hence J_3(h) = 0.

In order for the energy stored in the vicinity of the edge to be finite, the particle
displacement must vary as the one-half (or higher) power of distance from the edge.
Thus J_3(z) must vary as (h - z)^q, where q ≥ 1/2, near the edge. Aside from the
required variation near the edge, J_3(z) should have a variation with z that is similar to
that of the z component of particle displacement of the Rayleigh wave. Thus we assume the form

pR  i (15) 


J_j(z)  = f(z)j_e 


- -J«sR^ 

= e 


2k 


R 


where f(z) is the function accounting for the edge condition.

We have considered two different choices for f(z). The first choice was

f(z) = \sqrt{h - z}    (16)

Using Eq. (16) in Eq. (15), the integration in Eq. (8) can be obtained in closed form in
terms of complex error functions. For κ → ∞, it is found that I(κ) varies as 1/κ. Using
this it can be shown that the integrand in the first integral of Eq. (14) falls off rapidly
enough for κ → ∞ that the infinite integral converges. If J_3(z) did not vanish at z = h,
then the infinite integral would not converge, so that the edge condition is essential in
obtaining a variational solution.

The second choice of f(z) used was

f(z) = \cos(\pi z / 2h)    (17)

which varies linearly near the edge z = h. This choice of f(z) gives a simpler expression
for I(κ) than does Equation (16). The infinite integral in Eq. (14) is also found to converge
for f(z) in Eq. (17), and the computation time is much less than that needed when Eq.
(16) is used. The numerical results obtained for the reflection and transmission
coefficients R and T using Eqs. (16) and (17) are close to each other, indicating that the
variational expression is not sensitive to the choice of J_3(z).
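
The edge behavior of the two trial factors can be checked numerically. The sketch below assumes the readings f(z) = sqrt(h - z) for Eq. (16) and f(z) = cos(πz/2h) for Eq. (17), takes h = 1 in arbitrary units, and estimates the local exponent q in f ~ (h - z)^q from two points close to the tip. The helper name `local_exponent` is introduced here for illustration only.

```python
import math

h = 1.0  # crack depth (arbitrary units, assumed for illustration)


def f_sqrt(z):
    """Edge factor read from Eq. (16): f(z) = sqrt(h - z)."""
    return math.sqrt(h - z)


def f_cos(z):
    """Edge factor read from Eq. (17): f(z) = cos(pi z / 2h)."""
    return math.cos(math.pi * z / (2.0 * h))


def local_exponent(f, delta=1e-4):
    """Estimate q in f ~ (h - z)^q near the edge from two nearby points."""
    f1 = f(h - delta)
    f2 = f(h - 2.0 * delta)
    return math.log(f2 / f1) / math.log(2.0)


q_sqrt = local_exponent(f_sqrt)   # expected to be close to 1/2
q_cos = local_exponent(f_cos)     # expected to be close to 1 (linear)
print(q_sqrt, q_cos)
```

Both factors vanish at z = h and both satisfy q ≥ 1/2, so either is admissible under the finite-energy edge condition; the cosine form simply approaches zero faster.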

C.  Numerical  Results  for  a Crack  Originating  at  the  Surface 

Computations of R, T and the scattered power 1 - |R|^2 - |T|^2 were made for slip
cracks originating at the surface of aluminum and polyethylene. For comparison, both
choices of f(z) in Eqs. (16) and (17) were used for the aluminum solid.

Figure 2 shows the variation of |R|, |T| and 1 - |R|^2 - |T|^2 as a function of
crack depth h normalized to the Rayleigh wavelength λ_R = 2π/k_R. In computing the
results presented in Fig. 2, Eq. (16) was used for f(z). For h/λ_R < 0.2, the scattering
of the Rayleigh wave is very small, while for h/λ_R in the range 0.2 to 0.6, the scattering
increases sharply. As h/λ_R increases beyond 0.6, |R| approaches 0.68, |T| approaches
0.32 and the fraction of the incident power scattered into bulk waves approaches
0.43. For h > λ_R, the scattering is not sensitive to crack depth since most of the
power in the incident surface wave is concentrated near the surface. In Fig. 3 we have
plotted the phase angles of R and T. As with |R| and |T|, rapid variation in the phase
occurs in the range 0.2 < h/λ_R < 0.6. For h > λ_R, the phase angles approach a constant.

Fig. 4. R, T and scattered power for Rayleigh wave incident on a slip crack in
aluminum using f(z) = cos(πz/2h).

Fig. 5. Phase of R and T for Rayleigh wave incident on a slip crack in aluminum
using f(z) = cos(πz/2h).
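
The deep-crack limits quoted above can be cross-checked against the power balance 1 - |R|^2 - |T|^2. The short sketch below only restates that arithmetic; the function name is a label chosen here.

```python
def bulk_power_fraction(abs_R, abs_T):
    """Fraction of incident Rayleigh-wave power scattered into bulk waves."""
    return 1.0 - abs_R ** 2 - abs_T ** 2


# Deep-crack limiting magnitudes quoted in the text for aluminum.
abs_R, abs_T = 0.68, 0.32
fraction = bulk_power_fraction(abs_R, abs_T)
print(round(fraction, 4))
```

The result, about 0.435, agrees with the quoted limit of 0.43 for the bulk-wave fraction.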

Results obtained for aluminum but using the choice in Eq. (17) for f(z) are plotted
in Figures 4 and 5. Comparing Figs. 2 and 4 and Figs. 3 and 5, it is seen that the
results are not sensitive to f(z), and hence to J_3(z). The major difference appears to be in
the phase angles of R and T, and even these differences are small. Because the computing
time is much less when Eq. (17) is used, and since the results using Eqs. (16) and
(17) are nearly identical, the form of Eq. (17) for f(z) is used for subsequent calculations.

It is desirable to know how the scattering will depend on the material being studied.
It can be shown that the variational expression depends on the material properties only
through Poisson's ratio σ. For aluminum σ = 0.355, while for polyethylene σ = 0.458.
Computations made for polyethylene are shown in Figures 6 and 7. It is seen that this
change in σ produces only minor changes in R, T and 1 - |R|^2 - |T|^2. Further
computations will be made over a wider range of values of σ to determine the influence of σ
on the scattering.

STABILITY AND NOISE ANALYSIS OF THE ACOUSTOELECTRIC PHASE-LOCKED
LOOP DEVICE

H.  Schachter,  F.A.  Cassara  and  L.  Rosenheck 

We have previously reported (Refs. 1, 2 and 3) that the configuration of Fig. 1 acts as an
acoustoelectric phase-locked loop (PLL). In summary, if

e_1(t) = A \sin(\omega_1 t + \psi_1(t))

and

e_2(t) = B \cos(\omega_2 t + \psi_2(t))

are added at the input to the transducer of Fig. 1, they will give rise to two Rayleigh
acoustic surface waves. The electric fields carried by the Rayleigh waves will generate
oscillatory charges and electric fields in the nearby silicon plate. The nonlinear
interaction between these charges and electric fields gives rise to currents at difference
frequencies in the silicon. Finally, the open circuit voltage resulting from integrating
these currents over the length of the semiconductor is used to control the frequency of
a voltage controlled oscillator (VCO) whose sensitivity is β.

Fig. 1. Structure of a nonlinear acoustic surface-wave device. It performs
both the functions of multiplication and low-pass filtering.

The relation governing the behavior of the PLL was shown to be

+ (E^n^  + E^nj)  J sin[wjt  - w^t  + U(t  - ^)  dz  (1) 

where E_1 and E_2 are the amplitudes of the two electric fields, n_1 and n_2 the respective
generated charges, n_0 the thermal equilibrium charge distribution in the silicon, and
finally, v the propagation velocity of the Rayleigh waves. It was found useful to
normalize Eq. (1). Let


2 0 -^l“2  ^ ^2  1 

n ^ 2n^ 

(2a) 

\tau = \omega_n t    (2b)

\alpha = \omega_n L / v    (2c)

PL(E  n + E-n  ) 

Y = « 

0 n 

(2d) 

II 

(2e) 

then  Eq.  (1)  becomes 

= J(tU(t)  - (T  - q)  U(T  - a))  + sin(W  X - W X + 4'.(X)  - v|>2(X))dX  (3) 

T-Q 

Let ψ_1(τ) = 0; then in steady state the VCO frequency will follow that of the input signal
with a constant phase difference, i.e.,

\psi_2(\tau) = W_1 \tau - W_2 \tau - \psi_s

and solving Eq. (3) yields

\Delta W = W_1 - W_2 = \gamma + \alpha \sin\psi_s    (4a)

According to Eq. (4a) the hold-in range will vary between

\gamma - \alpha \le W_1 - W_2 \le \gamma + \alpha    (4b)

since

-1 \le \sin\psi_s \le 1
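
The static hold-in condition can be turned into a small calculation. The sketch below assumes the reading ΔW = γ + α sin ψ_s for Eq. (4a); the function names and the numerical values of γ and α are illustrative choices, not values from this report.

```python
import math


def hold_in_limits(gamma, alpha):
    """Static hold-in limits of Eq. (4b): gamma - alpha <= dW <= gamma + alpha."""
    return gamma - alpha, gamma + alpha


def steady_state_phase(delta_w, gamma, alpha):
    """Solve Eq. (4a), dW = gamma + alpha*sin(psi_s), for psi_s.

    Returns the principal solution, or None when no real psi_s exists,
    i.e. when the static analysis says the loop cannot hold lock.
    """
    s = (delta_w - gamma) / alpha
    if abs(s) > 1.0:
        return None
    return math.asin(s)


gamma, alpha = 0.5, 1.5          # illustrative values, not taken from the report
low, high = hold_in_limits(gamma, alpha)
psi_inside = steady_state_phase(1.25, gamma, alpha)   # inside the range
psi_outside = steady_state_phase(2.5, gamma, alpha)   # outside the range
print(low, high, psi_inside, psi_outside)
```

This is only the static prediction; the dynamic behavior of the loop can restrict the usable range further.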
A.  Stability  Analysis 

As shown in Eq. (4b), the maximum hold-in range (i.e., lock range) for the PLL
should be γ + α. In order to verify dynamically the static prediction of Eq. (4b),
computer  solutions  of  Eq.  (3)  were  carried  out  for  values  of  Y = 0,  a,  2a  with  a and  41^ 
as  parameters.  For  a given  value  of  a the  maximum  of  4;^  for  which  the  solution  was 
stable were calculated and the results are plotted in Figure 2. Since γ/α was kept
constant for these calculations, only the variations with α are considered.


For small α, the maximum hold-in range varies linearly with α as predicted.
However, for large values of α, the inherent nonlinearity of the system limits the hold-in
range, and the value of ΔW remains constant as α increases further.

These results were verified experimentally as shown in Figure 3. The experimental
set-up was identical to that described in References 1 and 2. Here the maximum
hold-in range is plotted versus A/B (the ratio of the amplitude of the input voltage
to the output of the VCO). In this experiment the output amplitude of the
VCO was kept constant and the DC component of the VCO input was compensated.
Analytically this corresponds to setting the first term on the right-hand side of Eq. (3) equal
to zero.


With γ = 0, Eq. (4) reduces to


A(*>  = u)  a 
n 


PLE^n^  A 


(5) 


A comparison of Figs. 2 and 3 shows that the hold-in range increases linearly but
saturates for large values of α or, equivalently (see Eq. (5)), for large values of A/B.

Fig. 3. PLL hold-in range versus ratio of input signal amplitude to VCO output
signal amplitude.

In summary, we observe that in order to get a maximum hold-in range and at the
same time have a system with good transient behavior (Refs. 1 and 3), we should choose
an α of about 1.4 to 1.6.

B.  Bandwidth  of  PLL 

The PLL is accompanied by an IF bandpass filter in the input and a low-pass filter
in the output. The closed loop bandwidth of the PLL is optimal at about twice the
bandwidth of the IF filter (Ref. 4) and of course much wider than the bandwidth 2f_m of the
post-detection filter.


In order to design the accompanying filters it is of interest to know the closed
loop bandwidth of the PLL. Linearizing Eq. (3), it is easy to show that the closed loop
response of the PLL is given by


H(jW) = \frac{1}{1 + \dfrac{(jW)^2}{1 - e^{-jW\alpha}}}    (6)


where W = ω/ω_n is the normalized frequency. Note that the PLL is a low pass filter
whose gain is one at DC and whose 3 dB bandwidth W_B is given by the solution to the
equation:

4(1 - \cos W_B\alpha) = 2(1 - W_B^2)(1 - \cos W_B\alpha) + W_B^4    (7)

which for small values of α reduces to W_B = α.

A plot of W_B vs. α is given in Figure 4. We see that the bandwidth increases with
α, reaches a maximum at α = 1.5, and decreases after that. Since, as we saw earlier,
an α of about 1.5 also gives a maximum hold-in range and a good transient
response, this suggests that designing a PLL with an α between 1.4 and 1.6 would represent
an optimum.
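
The dependence of the 3 dB bandwidth on α can be reproduced approximately by solving Eq. (7) numerically. The sketch below assumes the reading 4(1 - cos W_B α) = 2(1 - W_B^2)(1 - cos W_B α) + W_B^4 for Eq. (7), rearranged as 1 - cos(Wα) = W^4 / (2(1 + W^2)), and takes the first positive root by a scan followed by bisection. The function names, step size and search limit are choices made here.

```python
import math


def excess(W, alpha):
    """Positive while |H|^2 > 1/2, negative once the response is below 3 dB.

    This is Eq. (7) rearranged as 1 - cos(W*alpha) - W^4 / (2*(1 + W^2)).
    """
    return 1.0 - math.cos(W * alpha) - W ** 4 / (2.0 * (1.0 + W ** 2))


def bandwidth_3db(alpha, step=1e-3, w_max=10.0):
    """First positive root W_B of Eq. (7), found by scanning then bisecting."""
    lo = step
    while lo < w_max:
        hi = lo + step
        if excess(lo, alpha) > 0.0 >= excess(hi, alpha):
            for _ in range(60):               # bisection refinement
                mid = 0.5 * (lo + hi)
                if excess(mid, alpha) > 0.0:
                    lo = mid
                else:
                    hi = mid
            return 0.5 * (lo + hi)
        lo = hi
    raise ValueError("no 3 dB point found below w_max")


for alpha in (0.1, 1.0, 1.5, 2.5):
    print(alpha, bandwidth_3db(alpha))
```

Under this reading, W_B is close to α for small α and is larger at α = 1.5 than at α = 1.0 or α = 2.5, consistent with the maximum described above.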


Fig. 4. PLL normalized 3 dB bandwidth (W_B) versus α.


C.  Noise  Response  of  PLL 


When the input signal is corrupted by noise, additional a.m. and f.m. modulation
appears. Let us assume that the corrupting noise is gaussian with zero expected value.
Then the noise can be written as

n(t) = a(t) \cos\omega_1 t - b(t) \sin\omega_1 t    (8)

where a(t) and b(t) are slowly varying gaussian stochastic processes of zero expected
value. Assuming that the dominant pole of the low pass equivalent of the predetection
IF filter is ω_IF, then the autocorrelation function of a(t) and b(t) is

where ω_1 is the center frequency of the input signal and 2ω_IF is the 3 dB bandwidth of
the IF filter preceding the PLL.


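With the addition of noise the input signal becomes

e_1(t) = A \sin(\omega_1 t + \psi_1) + a \cos\omega_1 t - b \sin\omega_1 t    (9a)

Rewriting we get

e_1(t) = A R(t) \sin(\omega_1 t + \psi_1 + \theta(t))    (9b)

e_2(t) = B \cos(\omega_2 t + \psi_2)    (9c)

where

R(t) = \sqrt{ \left(1 + \frac{a}{A}\sin\psi_1 - \frac{b}{A}\cos\psi_1\right)^2 + \left(\frac{a}{A}\cos\psi_1 + \frac{b}{A}\sin\psi_1\right)^2 }    (9d)

and

\tan\theta = \frac{ \frac{a}{A}\cos\psi_1 + \frac{b}{A}\sin\psi_1 }{ 1 + \frac{a}{A}\sin\psi_1 - \frac{b}{A}\cos\psi_1 }    (9e)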
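
The envelope and phase form of the noisy input can be verified numerically. The sketch below assumes that R and θ are defined by R cos θ = 1 + (a/A) sin ψ_1 - (b/A) cos ψ_1 and R sin θ = (a/A) cos ψ_1 + (b/A) sin ψ_1, which is the reading of Eqs. (9d) and (9e) that follows from expanding Eq. (9a). All numerical values are illustrative, and a and b are held fixed as a single noise sample.

```python
import math


def envelope_and_phase(a, b, A, psi1):
    """Return R and theta of Eqs. (9d) and (9e) for one noise sample (a, b)."""
    in_phase = 1.0 + (a / A) * math.sin(psi1) - (b / A) * math.cos(psi1)
    quadrature = (a / A) * math.cos(psi1) + (b / A) * math.sin(psi1)
    R = math.hypot(in_phase, quadrature)
    theta = math.atan2(quadrature, in_phase)
    return R, theta


# Illustrative numbers (assumed, not from the report).
A, psi1, w1 = 2.0, 0.4, 3.0
a, b = 0.3, -0.2
R, theta = envelope_and_phase(a, b, A, psi1)

# Compare the two forms of e1(t), Eqs. (9a) and (9b), at several times.
max_diff = 0.0
for k in range(50):
    t = 0.1 * k
    form_a = (A * math.sin(w1 * t + psi1)
              + a * math.cos(w1 * t) - b * math.sin(w1 * t))
    form_b = A * R * math.sin(w1 * t + psi1 + theta)
    max_diff = max(max_diff, abs(form_a - form_b))
print(R, theta, max_diff)
```

For weak noise and ψ_1 = 0 the same expressions give R close to 1 and θ close to a/A, which is the approximation used in the signal-to-noise analysis.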


With e_1(t) and e_2(t) as in Eqs. (9b) and (9c), the normalized governing relation of the
PLL becomes


4/2('^)  = ^/  R^X)U(X)dX 


T 

T-Q 

T 


T-a 


+ ^/  U(\)dX+/  R(\)  sin[(Wj  - W2>X  + 4^j(\)  + 0(X)  - 

(10) 


where τ and α were defined in Equations (2b) and (2c).

1. Rectification Effects

We  notice  that  the  noise  will  contribute  some  rectification  which  will  result  in  a 
shift  in  the  hold-in  range  of  the  PLL.  For  small  noise,  assuming  that  in  steady  state 
the  average  value  of  the  sin[(Wj  - W2)X  + +|(X)  + 9(X)  - 4^2(X)]  = sin  4>g  we  get  with 
4-j(T)  = 0 and  4«2(t)  = 


(11a)

Rearranging and recognizing that

A . B _ Y 
2B  2X  Q 


we  obtain 


AW 

noise 


AW  + 

no  noise 


1)  + Q sin  4>^(  R - 1 ) 


(lib) 


where \overline{R^2} and \overline{R} represent, respectively, the expected values of R^2(t) and R(t).
Carrying out the expectations we find that


AW 


noise 


AW 


no  noise 


o , A , 3 . 
“ ^2  ^B 


This  shows  that  the  rectification  brought  by  noise  will  increase  the  maximum  hold-in 
range. 

2. Output Signal-to-Noise Ratio

We will now calculate the output signal-to-noise ratio (SNR) of the PLL under the
conditions of an input carrier mixed with weak gaussian noise (a/A ≪ 1). That is, we
assume that the system is at equilibrium with ψ_2 - (W_1 - W_2)τ + ψ_s = 0 and now we
introduce noise which disturbs this equilibrium.

Since we consider weak noise, we let

R(\tau) \approx 1

and

\theta(\tau) \approx a(\tau)/A

Under the assumption of τ > α and differentiating Eq. (10) we get

\ddot{\psi}_2(\tau) = \sin[(W_1 - W_2)\tau + a(\tau)/A - \psi_2(\tau)] - \sin[(W_1 - W_2)(\tau - \alpha) + a(\tau - \alpha)/A - \psi_2(\tau - \alpha)]    (12a)

If we let ψ_2(τ) = (W_1 - W_2)τ - ψ_s + φ_2(τ), where φ_2(τ) is the disturbance from equilibrium
of ψ_2(τ) introduced by noise, and assume that a(τ)/A - φ_2(τ) is small we get:

\ddot{\phi}_2(\tau) + \cos\psi_s\, [\phi_2(\tau) - \phi_2(\tau - \alpha)] = \left[ \frac{a(\tau)}{A} - \frac{a(\tau - \alpha)}{A} \right] \cos\psi_s    (12b)

Let S_{a/A}(W) be the spectrum of a/A and S_{\dot\phi_2}(W) the spectrum of \dot\phi_2(τ); then

\frac{S_{\dot\phi_2}(W)}{S_{a/A}(W)} = \frac{2\cos^2\psi_s\, (1 - \cos W\alpha)\, W^2}{W^4 + 2\cos\psi_s\, (\cos W\alpha - 1)(W^2 - \cos\psi_s)}    (13)

The  spectrum  of  a/A  is  given  by 


2W, 


^a/A 


(W) 


IF 


~Z  — 2 
A W + W 


IF 


where W_IF is half the bandwidth of the predetection IF filter. Assuming that the PLL is
followed by a post-detection filter of bandwidth W_M, the variance of \dot\phi_2 will be given by


W 


0 2 


^2  A^Zir  "0 


I. 


M 2W, 


IF 

r~T7T  ■ 


2 cos^4j  (1  - cos  W a)W^dW 


. 2'  ~ " 

w"+  w'^+  2cos  41  (cos  W a - 1)(W  - cos  4^^) 

IF  s 


(14) 


if  Wj^a  « 1,  and  « 1,  which  is  usual  in  the  case  of  PLL,  Eq.  (14)  can  be 

easily  integrated  to  give 
2 

rvi 

3 


0 

7T, 


2 arctan 


(15) 


'•Z  itA‘'Wjp  D 


where

D^2 = \frac{12 - 12\alpha^2\cos\psi_s - \alpha^4\cos^2\psi_s}{12\alpha^2\cos^2\psi_s}

Equation  (15)  represents  the  detected  RMS  output  noise  power.  The  detected  output 
signal  power  under  no  noise  conditions  is  [W  ^ • 

In a conventional first order PLL the governing equation with a weak corrupting
gaussian noise is (Ref. 5)

\dot{\psi}_2(t) = C \sin((W_1 - W_2)t + \psi_1(t) + \theta(t) - \psi_2(t))    (16)

where the notation used is consistent with the acoustic PLL notation previously defined.
As before,

\psi_2(t) = (W_1 - W_2)t - \psi_s + \phi_2(t)    (17)

where φ_2(t) is the disturbance due to noise.

Substituting Eq. (17) into Eq. (16) and linearizing we get

\dot{\phi}_2(t) = C(\theta - \phi_2)\cos\psi_s    (18)

Taking the spectrum of \dot\phi_2,

S_{\dot\phi_2}(\omega) = \frac{S_\theta\, C^2\cos^2\psi_s\, \omega^2}{\omega^2 + C^2\cos^2\psi_s}    (19)

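Assuming a post-detection filter of bandwidth ω_M and S_θ constant over 0 < ω < ω_M we
obtain

\sigma^2_{\dot\phi_2} = \frac{S_\theta\, C^3\cos^3\psi_s}{\pi} \left[ \frac{\omega_M}{C\cos\psi_s} - \arctan\left( \frac{\omega_M}{C\cos\psi_s} \right) \right]    (20)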


Comparing, we see that C cos ψ_s in Eq. (20) corresponds to

\frac{1}{D} = \sqrt{ \frac{12\alpha^2\cos^2\psi_s}{12 - 12\alpha^2\cos\psi_s - \alpha^4\cos^2\psi_s} }

in Eq. (15), which for α cos ψ_s ≪ 1 becomes

\frac{1}{D} \approx \alpha\cos\psi_s    (21)


That is, for small α the acoustic PLL behaves as a first order PLL with closed loop
bandwidth α, which agrees with previous interpretations. Also, substitution of Eq. (21)
in Eq. (15) reveals that the r.m.s. output noise for the conventional first order
PLL becomes identical with that of the acoustic PLL. The noise response for the PLL
outside the linear region is currently being investigated.
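
The small-α equivalence can be checked numerically. The sketch below assumes that Eq. (13), divided by W^2, gives the squared gain from a/A to the phase disturbance as 2cos^2ψ_s(1 - cos Wα) / [W^4 + 2cos ψ_s(cos Wα - 1)(W^2 - cos ψ_s)], and compares it with a first-order loop whose gain is α cos ψ_s as in Eq. (21). The function names and sample values are illustrative.

```python
import math


def acoustic_gain_sq(W, alpha, psi_s):
    """Squared gain from a/A to the phase disturbance (Eq. (13) without W^2)."""
    c = math.cos(psi_s)
    u = 1.0 - math.cos(W * alpha)
    return 2.0 * c * c * u / (W ** 4 - 2.0 * c * u * (W * W - c))


def first_order_gain_sq(W, loop_gain):
    """Squared closed-loop gain of a first-order loop with the given gain."""
    return loop_gain ** 2 / (W * W + loop_gain ** 2)


alpha, psi_s, W = 0.1, 0.3, 0.05   # small-alpha example values (assumed)
g_acoustic = acoustic_gain_sq(W, alpha, psi_s)
g_first = first_order_gain_sq(W, alpha * math.cos(psi_s))
print(g_acoustic, g_first)
```

For these values the two gains differ by well under one percent; the agreement degrades as α grows, which is where the acoustic loop departs from first-order behavior.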

D.  Conclusions 

The hold-in range, closed loop bandwidth, and weak noise response of the acoustoelectric
PLL have been derived. The computer studies presented reveal that a practical
"rule-of-thumb" to achieve good transient response and maximum hold-in range is to
design the PLL's normalized "delay time" α equal to 1.5. In addition, it has been demonstrated
that the detected output signal-to-noise ratio response of the acoustoelectric
PLL and the conventional PLL are identical when the input FM carrier is corrupted by
additive weak narrowband gaussian noise.

SPECULAR REFLECTION FROM SODIUM VAPOR
V. Nangia and B. Senitzky

Use of specular reflection from vapors as a means for developing optical filters
was suggested a couple of years ago. Investigations, both theoretical and experimental,
were carried out on mercury vapor filters and the results obtained at 2537 Å were
very encouraging. The mercury vapor cell operated as a narrowband filter with a
0.14 Å bandwidth, having a 7° acceptance angle. It provided a 28 dB peak-to-skirt rejection
ratio with a peak reflectance of 14% on a one inch diameter aperture. This filter was used
to get a monochromatic photograph of the distribution of mercury atoms in a flame and to
determine the concentration of mercury in water. A list of 26 elements that could be
investigated for a similar purpose was proposed. In order to investigate the application
of the device principle to one of the more "difficult" elements on the aforementioned
list, we decided to build a sodium vapor filter.

A.  Experiment 

Sodium at high temperatures attacks quartz; therefore we decided to build a cell
consisting of a nickel housing with a sapphire window. Attempts made to bond sapphire
to nickel to withstand temperatures around 900°C were unsuccessful. Consequently, it
was decided to use a commercially available seal that withstands temperatures to about
600°C. This seal was attached to one end of a nickel tubing (Fig. 1) by electro-deposition
of nickel. The other end of the tubing was attached to a Pyrex-Kovar seal (also by
electro-deposition of nickel). These electro-formed joints have been found to be
vacuum-tight during heat-cycling. This assembly was connected to a vacuum station (Figure
2). A sodium ampule was connected directly above the cell. After baking the cell at
600°C, sodium metal was melted with a propane torch (care being taken for uniform
heating all along the length of the ampule and the connecting glass tubing), so that a
small droplet gradually flowed down into the nickel tubing of the cell. The glass tube
was then tipped off. The nickel tubing was thereafter flattened and pinched using a
pinch-off tool which exerted 4500 psi of pressure. The pinched edge was electron-beam
welded in vacuum.

Fig. 1. Vapor cell joined to Kovar-Pyrex seal.

Fig. 2. Vapor cell and sodium ampule connected to vacuum station.

The cell was mounted in an oven and oriented to receive, at Brewster's angle,
incoming polarized light from a narrow-slit object illuminated by a sodium discharge lamp.
The intensities of light reflected both from the front surface and the back surface of the
window were measured using a photo-multiplier tube with a lock-in amplifier and a light
chopper. The experimental arrangement used is shown in Figure 3. The two images
produced on reflection from the two surfaces were separated so that each could be
scanned by the photo-multiplier. The birefringence of the sapphire did not affect our
measurements because we oriented the optic axis of the window perpendicular to the
direction of polarization of the incoming signal (we determined this position by finding
the intensity minima of the two images).

The results obtained for the reflected intensities for two different temperatures
are shown in Figure 4; these curves show the variation of reflected intensity as a function
of image position. Similar experiments were performed using a He-discharge lamp and
the results are shown in Figure 5. From these curves, it is found that, in the
case of the sodium line, the reflected intensity from the vapor-sapphire surface relative to
the air-sapphire surface increases by a factor of 11.7 for a temperature rise from 25°C to
595°C. In the case of the helium line, this factor is 1.6 for a temperature rise from 25°C
to 585°C. Thus the relative reflectance for the sodium line is about seven times larger
than that for the helium line.

The  above  results  are  conclusive  proof  that  we  are  obtaining  specular  reflection 
from  sodium  vapor  and  indicate  that  a sodium  vapor  filter  is  feasible.  We  are  present- 
ly obtaining  quantitative  data  on  this  phenomenon,  which  we  hope  to  compare  to  our 
theroetical  estimates. 
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Fig.  4.  Reflected  intensity  vs.  image 
position  for  sodium  line. 


POSITION  OF  IMAGE  /mm 

Fig.  5.  Reflected  intensity  vs , image 
position  for  helium  line. 
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WAVEGUIDES  FOR  INTEGRATED  OPTICS  FORMED  BY  METAL  PLATINGS 
A.  A.  Oliner  and  S.  T.  Peng 

Most waveguides which have been proposed for use in integrated (planar) optics
consist  of  combinations  of  dielectric  materials.  It  is  less  well  known  that  metal  strips 
plated  on  thin  films  on  a substrate  can  also  guide  optical  energy.  Such  metal  strips 
have  been  incorporated  into  devices  for  integrated  optics  (such  as  modulators  and 
switches)  which  require  the  application  of  electric  fields,  but  the  developers  have  often 
been  unaware  of  the  guiding  properties  of  those  devices. 

The  two  structures  to  be  discussed  here  are  shown  in  Figures  1 and  2.  Figure  1 
shows  an  optical  slot  waveguide,  formed  of  a slot  or  gap  of  unplated  film  on  a substrate 
between  two  plated  regions.  The  discussions  in  the  literature  for  this  waveguide  type 
are incorrect for one polarization, as will be explained below. The second configuration,
shown in Fig. 2, is a new waveguiding structure consisting simply of a strip of
metal plated on a film on a substrate.


Fig. 1. The slot waveguide. (Labeled region: dielectric film or layer.)

Fig. 2. The metal-strip waveguide. (Labeled region: metal.)


At optical frequencies, metals behave as overdense plasmas, so that they possess
dielectric constants that are essentially negative real. It is this property that permits
the guiding to occur without excessive loss. In addition, the performance of these guides
is different from that of guides at microwave frequencies that may resemble them
superficially. For example, the optical slot guide discussed here behaves quite differently
from the microwave slot guide; it is actually an analogue of the slot waveguide for
acoustic surface waves.


The method of calculation employed by us in determining the propagation properties
of these structures is the transverse resonance procedure, which is widely used at
microwave frequencies and which was in fact originally developed in a microwave
context. This method can provide more accurate results for optical waveguides than other
procedures which have so far been used, and yet it is simple and straightforward.
Furthermore, the procedure readily takes into account the presence of geometrical
discontinuities between the various regions comprising the waveguide structure, and is
capable of systematic improvement in successive steps, to the accuracy desired.
A. The Slot Waveguide

The slotted metal-clad waveguide shown in Fig. 1, designated here as the "slot
waveguide," has been considered previously in the literature (Refs. 1, 3, 4). This
waveguiding structure was first proposed and analyzed by one of the present authors, and
it was soon afterwards independently proposed, measured and analyzed by Japanese workers
(Refs. 3, 4).

The modes which propagate may be divided into two classes: those for which only
an electric field, or only a magnetic field, is present in the vertical direction,
perpendicular to the dielectric film. These modes correspond to the E^y and E^x modes,
respectively, described by Marcatili (Ref. 5). We are concerned now with the former class,
or polarization, since it is for this class that previous analyses are incorrect.

The transverse resonance approach to the solution of waveguiding problems proceeds
by viewing the guiding structure transversely and identifying the modes which
exist in each of the constituent regions. For the E-vertical polarization, it is well
known that, at the lower frequency end of the propagation range, only one mode is
present in the unplated region whereas two modes are present simultaneously in the
plated region, one of them being the "plasmon" mode. These modes are identified as
solid lines in the dispersion plots shown in Figure 3. The TM_u mode is the one which
is present on an unplated, or unmetallized, film of infinite width, whereas the TM_m
mode, where the subscript m signifies "metallized," and the "plasmon" mode P occur
simultaneously as independent modes on a plated, or metallized, film of infinite width.
The "plasmon" mode occurs because the metal behaves like an overdense plasma at
optical frequencies, and this mode would exist even if the dielectric film were absent.
It is, in fact, the dominant mode for this polarization (E vertical), and it cannot be
neglected; for H-vertical polarization, the "plasmon" mode is absent. It is this
"plasmon" mode which was omitted from previously published analyses for the
E-vertical polarization.


Fig. 3. Dispersion curves (n_eff vs. film thickness), shown dashed, for
the lowest E-vertical mode in the slot waveguide. Curves for
the constituent modes are shown solid (the vertical spacing
between the P and the TM curves is generally greater than is
shown here).
1. Qualitative Explanation of Propagation Behavior

The propagation behavior of the slot waveguide may be understood by referring
to the transverse equivalent network for the structure, shown in Fig. 4, together with
the curves in Figure 3. For complete guiding, the unplated region must be "slow" and
the plated regions "fast"; the fields in the plated regions then decay exponentially in the
transverse direction. By inspection of the solid line dispersion curves in Fig. 3, we
observe that the TM_m mode is faster (smaller value of n_eff) than the TM_u mode. This
occurs because the metal plating possesses a negative real dielectric constant (with a
small imaginary part); its effect is therefore to lower the net value of dielectric
constant for the combination of plating plus film and substrate in the plated regions.
Fig. 4. Transverse equivalent network for the slot waveguide.

Thus, if the P mode could be neglected, the wave would indeed be purely bound,
with exponential decay occurring in the plated regions. Under that condition, the
transverse equivalent network in Fig. 4 would reduce to a network with one mode inside and
only one mode in each of the outside regions. The resulting numerical values for the
propagation constant would be almost exactly the same as those obtained in Reference 4.
This solution is shown qualitatively in Fig. 3 as the curve with short dashes, labeled
"without P." Because of the approximation (only one mode in each region), the curve
does not exist below thickness t_2, the cutoff thickness of the TM_m mode.

In the more accurate representation shown in Fig. 4, the P transmission lines in
the outer, metallized regions are included. For thicknesses less than t_2, the TM_m
transmission lines are no longer present, but the P lines are, so that propagation can
continue down to thickness t_1, the cutoff thickness for the TM_u mode by itself. Inclusion
of the P mode thus shows that the slot waveguide can still guide energy even though the
film thickness is smaller than t_2. This solution is shown by the line with longer dashes,
labeled "with P." It is also seen that for film thicknesses reasonably above t_2 the
earlier solution for n_eff is quite accurate.

However, the P mode, as seen from Fig. 3, is even slower than the TM_u mode.
The P transmission lines in Fig. 4 are therefore above cutoff, with the result that some
energy must be leaking transversely as the wave progresses down the slot guide.
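
The cutoff argument above can be illustrated with a minimal sketch. It assumes lossless (real) effective indices, and the three index values are hypothetical numbers chosen only to respect the ordering described in the text (TM_m fastest, P slowest); they are not taken from the report's calculations. Each outside transmission line carries a transverse wavenumber kx = k (n_line^2 - n_guided^2)^(1/2), so a constituent mode slower than the guided wave gives a real kx (leakage) and a faster one gives an imaginary kx (exponential decay).

```python
import cmath


def transverse_wavenumber(n_line: float, n_guided: float, k0: float = 1.0) -> complex:
    """Transverse wavenumber of one outside transmission line.

    n_line   : effective index of the constituent mode (TM_m or P) on an
               infinitely wide plated film
    n_guided : effective index of the wave guided along the slot
    k0       : free-space wavenumber (normalized to 1 here)
    """
    return k0 * cmath.sqrt(complex(n_line**2 - n_guided**2))


def line_state(n_line: float, n_guided: float) -> str:
    kx = transverse_wavenumber(n_line, n_guided)
    # Real kx: the line propagates transversely, so energy leaks sideways.
    # Imaginary kx: the field decays exponentially away from the slot.
    return "above cutoff" if kx.real > 0.0 else "below cutoff"


# Hypothetical effective indices, for illustration only.
N_GUIDED = 1.60  # wave guided by the slot (close to the TM_u value)
N_TM_M = 1.55    # TM_m mode of the plated film (faster than the guided wave)
N_P = 1.85       # plasmon mode of the plated film (slower than the guided wave)

print("TM_m line:", line_state(N_TM_M, N_GUIDED))
print("P line:   ", line_state(N_P, N_GUIDED))
```

With these illustrative numbers the TM_m line is below cutoff while the P line is above cutoff, which is the situation that produces transverse leakage.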
The attenuation constant for the slot guide when the P mode is included should
therefore be greater than the value computed without the P mode for two reasons: the
P mode is intrinsically lossier and introduces new attenuation, and leakage of energy
is produced, as shown above.

2. Numerical Values for the Dispersion and Attenuation Behavior

Numerical values for the propagation behavior as a function of certain geometrical
parameters were calculated from the transverse equivalent network shown in
Figure 4. In this network, the transmission line parameter values are known rigorously
from the respective infinitely wide plated and unplated regions. The boxes, which
represent the coupling between the various transmission lines, are treated approximately.
Rigorously, the boxes should contain contributions from the continuous spectrum,
but in our analysis they are neglected since we expect their influence to be numerically
negligible for the very wide slots generally employed in integrated optics. Our boxes
actually comprise only transformers arranged so that the two outside transmission lines
on each side are in series with each other. Such transformer turns ratios are given
by simple overlap integrals.

The structure chosen for analysis consists of film and substrate refractive indices
of 1.68 and 1.48, respectively, with a silver plating. Silver has a dielectric constant
at a wavelength of 0.633 μm of about -16 - j0.5; the accurate refractive index is
0.067 - j4.040. The dispersion curve is expressed in terms of n_eff vs. t/λ, where t is
the film thickness and λ (= 2π/k) is the free-space wavelength. When the
slot is wide relative to wavelength, the dispersion curve is very close to that for the
TM_u mode, corresponding to the unplated region itself, because most of the field is
then in the unplated region. In order to obtain a curve which deviates from the TM_u
values, we show in Fig. 5 the results for a rather narrow slot, for W/λ = 2.0, where
W is the slot width. It is seen that the behavior follows very closely that predicted by
the qualitative discussion above. For values of t/λ above the cutoff for the TM_m mode,
the curve (shown dashed) neglecting the P mode yields values almost identical to those
for the accurate calculation, including the P mode. It should be noted that for most
situations the film and substrate refractive indices will be closer to each other than in
this calculation, and that the slot widths would be greater; for those situations, the
differences between the dashed and the heavy solid curves would really be negligible.
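
As a quick arithmetic check of the material constants quoted above, the sketch below squares the stated refractive index of silver and compares it with the rounded dielectric constant. It also evaluates the textbook surface-plasmon index for a single metal-dielectric interface, n_P = [ε_m ε_d / (ε_m + ε_d)]^(1/2). Using that single-interface formula with the film index 1.68 is an assumption (it corresponds to a very thick film) and is not the report's transverse-resonance calculation; it is meant only to show why the P mode is "slow."

```python
import cmath

# Values quoted in the text (time dependence exp(+j w t), so losses appear
# as negative imaginary parts).
N_SILVER = 0.067 - 4.040j   # refractive index of silver at 0.633 um
N_FILM = 1.68               # film refractive index

eps_silver = N_SILVER**2    # should be close to the quoted -16 - j0.5


def plasmon_index(eps_metal: complex, eps_dielectric: float) -> complex:
    """Effective index of a surface plasmon on a single planar interface.

    Assumption: semi-infinite metal on a semi-infinite dielectric, which is
    only the thick-film limit of the plated structure discussed in the text.
    """
    return cmath.sqrt(eps_metal * eps_dielectric / (eps_metal + eps_dielectric))


n_p = plasmon_index(eps_silver, N_FILM**2)

print("eps (silver)  =", eps_silver)
print("n_P (approx.) =", n_p)
```

Under these assumptions the plasmon index comes out near 1.85, above the film index of 1.68, consistent with the statement that the P mode is slower than the TM modes of the film.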

For values of t between the cutoffs of the TM_u and TM_m curves, on the other hand,
the dashed and heavy solid curves are entirely different, in that the dashed curve
predicts no guided propagation at all in this film thickness range. Thus, the inclusion of
the P mode in the analysis shows that guided propagation can occur down to lower film
thicknesses than previously believed. The slight kink appearing in the curve labeled
"with P" at the cutoff of the TM_m mode is due, we believe, to the neglect of the
continuous spectrum in the analysis.

Fig. 5. Dispersion behavior for a slot waveguide
employing a silver plating, expressing
n_eff (= λ/λ_g = β/k) as a function of normalized
film thickness, t/λ.

A curve of attenuation α as a function of slot width W is presented in normalized
form in Fig. 6, when the refractive index values are the same as those in Fig. 5 and
the film thickness is given by t/λ = 0.60, in the range for which both outside
transmission lines exist in the network of Figure 4. It is seen that the attenuation drops
substantially as the slot width is increased, which is expected since then less of the field
is present in the metallized regions, and that the inclusion of the P mode increases the
attenuation by more than an order of magnitude. For very narrow slots, the attenuation
is seen to exhibit a peak and then a decrease, as a result of the changing partitioning
of field between the P and the TM_m modes in the outside (plated) regions. We see
from Fig. 6 that lower attenuation is obtainable when the slot width is increased, but
that nevertheless the inclusion of the P mode in the solution introduces a serious
increase in α.

The variation of attenuation with film thickness t is shown in Fig. 7, for two
different [...] higher for smaller values of W. We also observe quite clearly that the
solution with P yields substantially higher α values. It is also evident that the solution
with P continues down to lower values of t/λ, consistent with Fig. 5, but it is
interesting that the α values do not increase in this range. The curve of attenuation for
the TM_m mode alone is also shown (small dashes); it is seen that for large W/λ values it
is easy to achieve attenuations substantially below these values.

B. The Metal-Strip Waveguide

The  structure  of  this  waveguide  was  shown  in  Fig.  2;  it  consists  of  a metal  strip 
on  a thin  film  on  a substrate.  The  metal  is  again  viewed  as  an  overdense  plasma. 

The transverse equivalent network for this simple structure, which apparently
has not been analyzed previously or recognized as a possible optical waveguide, is
shown in Figure 8. For complete guidance, the TM_u mode transmission line must be
below cutoff; this requirement is satisfied over all ranges of film thickness because the
P mode is always slower than the TM_u mode, as seen from Figure 3. The TM_m mode,
on the other hand, is even faster than the TM_u mode, so that its transmission line is
also below cutoff. Thus, the only propagating line in Fig. 8 is the P mode line; the
wave guided by the metal-strip structure should therefore be dominated by the P mode.
Since the P mode is present only for E-vertical polarization, the metal-strip structure
has the immediate advantage that only the E-vertical polarization modes can be guided
by it.


Fig.  8.  Transverse  equivalent  network 
for  the  metal-strip  waveguide. 

The metal strip will also guide for thicknesses smaller than t_1 (and even if the
dielectric film is completely absent), although we have not analyzed that range. The
metal strip waveguide will therefore guide completely (no leakage, but with high
attenuation) over all values of film thickness.

A theoretically calculated plot of n_eff as a function of t/λ for this guiding
structure is presented in Fig. 9 for a silver strip and the other parameters chosen
previously. As expected, the curve resembles that of the P wave itself, so that the
effective refractive index for the wave guided by the strip is substantially higher than
that for the unplated region alone. As a result, the wave is more tightly bound to the
strip than is the case for other optical linear waveguides. A practical consequence of
this tight binding is that it permits the waveguide to undergo much sharper bends before
the leakage radiation becomes significant. This problem relating to bends is a serious one
for integrated-optical waveguides because the effective refractive index of the waveguide
is usually almost the same as that for the neighboring regions, resulting in very weak
binding.

Fig. 9. Dispersion curve (n_eff vs. film
thickness) for the metal-strip
waveguide when silver is used.

Tight binding also implies that most of the wave is in the strip region; as a
consequence, we would expect that the attenuation resembles closely that of the P wave
alone. Alas, calculations bear this out; furthermore, the attenuation changes little
with  strip  width.  Since  the  P mode  is  very  lossy,  the  metal-strip  waveguide  possesses 
an  attenuation  value  which  for  most  applications  is  disqualifying.  This  aspect  is  very 
unfortunate,  because,  except  for  the  attenuation  property,  the  waveguide  possesses 
an  unusual  set  of  virtues.  It  may  be  possible  that  the  P mode  for  other  materials  at 
other  frequencies  possesses  much  lower  attenuation  values,  but  we  have  not  investigated 
this  feature. 
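
To give a feeling for how lossy the P mode is, the following sketch estimates its attenuation from the silver dielectric constant quoted earlier (about -16 - j0.5 at 0.633 μm). It assumes the single-interface surface-plasmon formula with the film index 1.68, i.e. the thick-film limit, so the result is an order-of-magnitude illustration rather than the value computed in this report. The function name and the choice of dB per micrometre as the unit are illustrative.

```python
import cmath
import math


def plasmon_loss_db_per_um(eps_metal: complex, n_dielectric: float,
                           wavelength_um: float) -> float:
    """Power attenuation of a single-interface surface plasmon, in dB per um.

    Assumption: semi-infinite metal and dielectric (thick-film limit).
    """
    eps_d = n_dielectric**2
    n_p = cmath.sqrt(eps_metal * eps_d / (eps_metal + eps_d))
    # Field amplitude decays as exp(-alpha * z), alpha in nepers per um.
    alpha = 2.0 * math.pi / wavelength_um * abs(n_p.imag)
    # Power in dB: 10*log10(exp(2*alpha*z)) = 20*log10(e) * alpha * z.
    return 20.0 * math.log10(math.e) * alpha


loss = plasmon_loss_db_per_um(-16.0 - 0.5j, 1.68, 0.633)
print(f"estimated P-mode loss: {loss:.2f} dB per micrometre")
```

Under these assumptions the loss is on the order of half a decibel per micrometre, i.e. thousands of dB per centimetre, which illustrates why the attenuation is described as disqualifying for most applications.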

C.  Conclusions 

The earlier theoretical analyses for the slot waveguide are correct for the
H-vertical polarization modes, but incorrect for the E-vertical polarization modes,
because a constituent plasmon mode, due to the metal plating, was omitted from the
analysis. A transverse equivalent network approach which includes the plasmon mode shows
that two major effects occur:

(1) The wave can be guided over a larger range of film thicknesses than
previously thought; guiding can occur down to the cutoff thickness of
the unplated region.

(2) The wave is not completely guided, as previously thought, but leaks
energy transversely. This leakage, plus the higher intrinsic attenuation
of the plasmon mode, increases the attenuation of the guided wave
by more than an order of magnitude over earlier calculations.

Because of the higher attenuation possessed by the E-vertical polarization modes
of the slot waveguide, they should be avoided whenever possible and the H-vertical
modes should be used instead. The plasmon mode is not a constituent of the H-vertical
modes, so that their attenuation values are significantly lower than those of the
E-vertical modes for the same slot widths. Fortunately, the modulator or switch
applications projected for the slot waveguide, which involve an rf or dc electric field
between the plated regions, automatically require the use of the H-vertical optical mode
(or E^x mode in Marcatili's notation, Ref. 5) for optimum interaction.

The  metal-strip  waveguide  is  a simple  structure  which  has  previously  not  been 
analyzed  or  recognized  as  a possible  optical  waveguide.  Its  only  disadvantage,  which 
is  a very  serious  one,  is  its  unusually  high  attenuation,  about  that  of  the  plasmon  mode 
by  itself.  This  is  very  unfortunate  because  the  waveguide  possesses  several  unique 
advantages,  in  addition  to  its  very  simple  structure.  Its  unique  advantages  are: 

(1) It guides completely (no leakage), and it guides only for the E-vertical
polarization, thus eliminating a whole set of higher modes and the usual
cross-polarization problems.

(2) Its wave is very tightly bound, thus permitting much sharper bends and
overcoming one of the principal problems facing the use of integrated-optical
waveguides.

(3) It will guide for all values of film thickness, and even without the presence
of a film.

Perhaps  sufficiently  low  attenuation  values,  permitting  this  unique  waveguide  to  become 
practical,  may  be  possible  with  other  materials  at  other  frequencies. 

Joint  Services  Technical  Advisory  Committee 
RAY OPTICAL CALCULATION OF EDGE DIFFRACTION IN UNSTABLE RESONATORS
L.B.  Felsen  and  C.  Santana 


By a fairly recent generalization (Refs. 1, 2), it has been shown how high frequency ray
optical techniques can be adapted to the analysis of scattering by localized
discontinuities (small obstacles, edges, etc.) in waveguides or ducts filled with homogeneous
or inhomogeneous dielectric media. Basic to the technique is the ray optical formulation
of the waveguide Green's function, i.e., the radiation from a source with isotropic
radiation pattern. This is then generalized to non-isotropic sources with radiation
pattern f(θ), where θ is the angle measured from the waveguide axis y. (Although
three-dimensional problems can be treated by this method, we consider here only the
two-dimensional z-independent case.) A localized discontinuity may be characterized by its
free-space diffraction pattern f(θ, θ_i) when the incident field is a uniform plane wave
impinging from the direction θ_i. When the discontinuity is placed inside the waveguide
and illuminated by an incident waveguide mode, which can locally be decomposed into
uniform plane waves with characteristic angles θ_j^±, the resulting f(θ, θ_j^±) constitutes
an equivalent non-isotropic source whose excitation of modal fields may be calculated
from the solution referred to above. In this manner, one derives by ray optical
techniques the modal reflection, transmission and coupling coefficients (i.e., the
scattering matrix elements) for a discontinuity inside the waveguide. The lowest-order,
single diffraction solution so obtained may be refined by accounting for multiple
diffraction effects due to interaction between the singly diffracted fields and the
waveguide boundaries. For details of the method, the reader is referred to previous work
(Refs. 2-6).


The ray optical technique has already been applied to the study of discontinuities
in waveguides of various types (Refs. 6, 7). In the present report, it is shown how it
can be applied to the important problem of unstable open optical resonators. Because of
their good mode selectivity and large mode volume, such structures appear to be most
promising for use with laser sources of high and even moderate gain. By recent studies
performed independently in the United States and the Soviet Union, it has been
shown that the unstable resonator can be regarded as a waveguide whose boundaries are
the convex resonator mirrors and whose axis is transverse to the resonator axis
(Figure 1). Resonance in this open waveguide is established by self-consistent reflection
of a propagating waveguide mode between the edge discontinuities formed by the rims of
the mirrors. Although the waveguide is very strongly overmoded, it has been shown
that near the cutoff condition, which is of interest for the resonator problem, mode
coupling due to the mirror edges is confined essentially to adjacent modes. Thus, a
very simple model involving selective coupling between two waveguide modes has been
developed and has been found capable of explaining the intricate eigenmode loss behavior
determined by numerical solution of the resonator integral equation (Refs. 9, 10). While
the role of mode coupling has been alluded to in the Russian work, it has not been
incorporated into their analysis. The Soviet calculations are based on a single mode
analysis, which is adequate only near eigenmode loss minima and does not provide the
peculiar interconnections between successive loss minima found in the numerical results.

Fig. 1. Unstable resonator with hyperbolic mirrors. The resonator
axis lies along x and the waveguide axis along y. A typical
modal caustic, an ellipse with foci at x = ±d, is shown, together
with one congruence of upgoing modal rays. Ray A strikes
the upper edge, and θ_j is the propagation angle of the
corresponding local plane wave field in the j-th mode. A similar
congruence of downgoing modal rays has been omitted, as has
the corresponding picture for the left half of the resonator.
For very slender caustics, all modal rays appear to originate
at the foci. The edges of the mirrors are located at ±y_1, μ_1
or ζ_1 in the various coordinate systems defined in the text; the
analogous designation for the modal caustic is μ_cn or ζ_cn.

A further attribute of our analysis (Refs. 9, 10) is the avoidance of the resonator
integral equation, which forms the basis of the Soviet approach to the waveguide problem
as originally formulated by Weinstein (Ref. 12) and followed thereafter by others in the
Soviet Union (Refs. 11, 13). By avoiding the integral equation, it is possible to
decompose the unstable resonator problem into conventional microwave network constituents
involving propagation (waveguide) and discontinuity regions. By this separation, one may
extend the analysis also to resonator configurations which are filled with inhomogeneous
and/or active materials, and to mirror shapes which depart from the conventional circular
contours.


While the ray optical principle of localization is consonant with the microwave
network approach, the reflection and coupling coefficients due to the mirror edges were
previously not calculated by the ray optical method described earlier. Instead,
these coefficients were taken from Weinstein (Ref. 12) by modeling the region near the
edges locally as an open-ended parallel plane waveguide. Since the ray optical method
synthesizes the reflection and coupling coefficients by direct edge scattering, it is of
interest to examine whether the two procedures yield the same result. This is especially
important for resonators with moderately large Fresnel numbers, where the local
parallel plane approximation near the edges is more difficult to justify since the
slanting of the convex waveguide boundaries is then not negligible. (The Fresnel number is
defined as N = k y_1^2 / 4πL; k is the wavenumber and y_1 and L are given in Figure 1.) It
will be shown that the single-diffraction ray-optically evaluated reflection and coupling
coefficients are identical with those obtained by Weinstein from the rigorous solution
of the open-ended parallel plane waveguide problem when the characteristic angle of
the incident mode is not almost 90° with respect to the waveguide walls; this is the range
of interest for the moderately large Fresnel number regime. This confirmation then
suggests that the single diffraction ray optical model may be used with confidence for
large and very large Fresnel numbers where the local parallel plane approximation is
clearly in doubt. The reflection and coupling coefficients derived here may then be
regarded as more reliable than any of those available heretofore. Although the ray
optical edge diffraction mechanism has been proposed as an explanation of the numerically
observed eigenmode loss behavior, this fact has not been incorporated into a
systematic modal theory. Such an incorporation is performed here.
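
For orientation, the Fresnel number mentioned above can be evaluated with a minimal sketch. It assumes the definition reads N = k y_1^2 / (4πL), with y_1 the mirror half-width and L the resonator half length (the printed formula is partly illegible in this copy, so the squared y_1 is a reconstruction); with k = 2π/λ this equals y_1^2 / (2λL). The numerical inputs are hypothetical.

```python
import math


def fresnel_number(k: float, y1: float, half_length: float) -> float:
    """Fresnel number N = k * y1**2 / (4 * pi * L).

    Assumed reading of the definition in the text:
    k            wavenumber (2*pi / wavelength)
    y1           transverse coordinate of the mirror edge (half-width)
    half_length  resonator half length L along the x axis
    All lengths must share one unit.
    """
    return k * y1**2 / (4.0 * math.pi * half_length)


# Hypothetical geometry in units of the wavelength.
wavelength = 1.0
y1 = 10.0
L = 25.0
N = fresnel_number(2.0 * math.pi / wavelength, y1, L)
print("Fresnel number N =", N)
```

For these illustrative values the two equivalent forms, k y_1^2 / (4πL) and y_1^2 / (2λL), both give N = 2.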

A.  The  Waveguide  Green's  Function 

We seek a solution of the equation

$$(\nabla^2 + k^2)\, G(\boldsymbol{\rho}, \boldsymbol{\rho}') = -\delta(\boldsymbol{\rho} - \boldsymbol{\rho}'), \qquad \boldsymbol{\rho} = (\mu, \eta) \qquad (1)$$

subject to the boundary conditions

$$\frac{\partial G}{\partial \mu} = 0 \quad \text{at } \mu = 0 \qquad (1a)$$

$$G = 0 \quad \text{at } \eta = \pm\eta_1, \qquad (1b)$$

and a radiation condition at $\mu \to \infty$. A time dependence $\exp(-i\omega t)$ is suppressed. Here,
$\mu$ = constant and $\eta$ = constant are coordinate surfaces in an elliptic coordinate system
(Figure 2). The boundary condition in Eq. (1b) identifies G as the single component electric
field $E = E_z$ and the source as a suitably normalized line of z-directed electric currents.
Since we shall be interested only in field solutions which are even with respect to the
y = 0 plane, the boundary condition in Eq. (1a) has been imposed and effectively bisects
the waveguide.


Fig. 2. Bisected waveguide configuration. (Labeled boundary: magnetic wall.)

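In terms of waves propagating along the y (or $\mu$) directions, the Green's function
can be represented as:

$$G(\boldsymbol{\rho}, \boldsymbol{\rho}') = \sum_n \frac{\Phi_n(\eta)\,\Phi_n(\eta')}{N_n^2}\; g_n(\mu, \mu') \qquad (2)$$

where

$$N_n^2 = \int_{-\eta_1}^{\eta_1} \Phi_n^2(\eta)\, d\eta \qquad (2a)$$

is the squared normalization constant for the eigenfunctions $\Phi_n$. The latter satisfy
the source-free one-dimensional equation

$$\left[\frac{d^2}{d\eta^2} + h^2\,(b_n^2 - \sin^2\eta)\right] \Phi_n(\eta) = 0 \qquad (3)$$

with $b_n$ representing the modal eigenvalue and

$$\Phi_n(\pm\eta_1) = 0. \qquad (3a)$$

The one-dimensional Green's function $g_n(\mu, \mu')$ satisfies the source-excited equation

$$\left[\frac{d^2}{d\mu^2} + h^2\,(\cosh^2\mu - b_n^2)\right] g_n(\mu, \mu') = -\delta(\mu - \mu') \qquad (4)$$

with

$$\frac{d g_n}{d\mu} = 0 \ \text{at } \mu = 0, \qquad \text{radiation condition at } \mu \to \infty. \qquad (4a)$$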

The eigenfunctions $\Phi_n$ were determined previously. Assuming that the turning
points $\eta = \pm\sin^{-1} b_n$ lie outside the domain $-\eta_1 < \eta < \eta_1$ (this is the case for the modes
of interest), and since h is large, one may employ the WKB approximations for $\Phi_n$:

$$\Phi_n(\eta) \simeq \frac{\sin\left[h \int_{-\eta_1}^{\eta} \psi_n(\tau)\, d\tau\right]}{[\psi_n(\eta)]^{1/2}}, \qquad \psi_n(\eta) = (b_n^2 - \sin^2\eta)^{1/2} \qquad (5)$$

with $b_n$ defined by Eq. (3a) via

$$h \int_{-\eta_1}^{\eta_1} \psi_n(\tau)\, d\tau = n\pi, \qquad n = \text{integer} \qquad (5a)$$

9 

or  approximately  as, 
n k In  M 

where

$$M = 1 + 2\gamma + 2\sqrt{\sigma}, \qquad \sigma = \gamma^2 + \gamma, \qquad \gamma = L/r \qquad (5c)$$

M is the linear magnification and $\gamma$ is the ratio of the resonator half length L measured
along the x axis to the radius of curvature r of the mirrors on this axis. When the sine
function in Eq. (5) is expressed as the sum of two exponentials and the result then
substituted into Eq. (2a), one observes that two of the resulting integrals contribute
negligibly because of rapidly fluctuating integrands. Thus,


$$N_n^2 \simeq \frac{1}{2} \int_{-\eta_1}^{\eta_1} \frac{d\tau}{\psi_n(\tau)} \simeq \frac{1}{2} \ln M \qquad (6)$$

The evaluation of the integral in Eq. (6) is based on the approximation $b_n \approx 1$, which
applies to the modes of interest (see Reference 9).
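
The link between the magnification M and the normalization integral can be checked numerically with a short sketch. It rests on two assumptions that go beyond the legible text: that Eq. (5c) reads M = 1 + 2γ + 2(γ² + γ)^(1/2) with γ = L/r, and that for hyperbolic mirrors η = ±η_1 with foci at x = ±d the geometry gives γ = tan²η_1 (from L = d sin η_1 and vertex radius of curvature r = d cos²η_1 / sin η_1). With b_n ≈ 1, ψ_n(τ) ≈ cos τ, and the integral of dτ/cos τ over (-η_1, η_1) should then equal ln M.

```python
import math


def magnification(gamma: float) -> float:
    """Linear magnification, assuming M = 1 + 2*gamma + 2*sqrt(gamma**2 + gamma)."""
    sigma = gamma * gamma + gamma
    return 1.0 + 2.0 * gamma + 2.0 * math.sqrt(sigma)


def norm_integral(eta1: float, steps: int = 2000) -> float:
    """Simpson's rule for the integral of d(tau)/cos(tau) over (-eta1, eta1).

    This is the integral in Eq. (6) with psi_n replaced by cos(tau),
    i.e. under the approximation b_n = 1. `steps` must be even.
    """
    h = 2.0 * eta1 / steps
    total = 1.0 / math.cos(-eta1) + 1.0 / math.cos(eta1)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / math.cos(-eta1 + i * h)
    return total * h / 3.0


eta1 = 0.5                       # hypothetical mirror coordinate (radians)
gamma = math.tan(eta1) ** 2      # assumed hyperbolic-mirror geometry
M = magnification(gamma)

print("M            =", M)
print("ln M         =", math.log(M))
print("integral     =", norm_integral(eta1))
```

Under these assumptions the integral and ln M agree to numerical precision, so N_n² ≈ (ln M)/2; the expression for M also satisfies M + 1/M = 2(1 + 2γ), the usual relation for a symmetric unstable resonator.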

The modal Green's function $g_n(\mu, \mu')$ is constructed in terms of solutions
which satisfy the source-free Eq. (4) and the boundary conditions at $\mu = 0$ and
$\mu \to \infty$, respectively. Introducing

; = (2h)^/^fx,  i = (2h)‘/2g.  P„  = I 

where  u locates  the  turning  point  (modal  caustic)  of  the  approximated  differential 
“ cn 

equation  (4),  one  may  write 

(-^+  ^ - P„)  g„(C.  ;')  = -t') 
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with  boundary  conditions  corresponding  to  Equation  (4a).  The  solution  for  is: 


irp„/2  ip„  r(i+i^) 


g = ® 
®n 


(2  ") 


r(i-i^) 


{d  (z^)  + i(2 


1 


-ip.  r(i-i^) 

p -v-1  < ■>  v > 

(9) 


_4_ 

r(i+i^) 


where 


z = , V = -l/2  - ip 


(10) 


For observation points $\zeta \gg 2|p_n|^{1/2}$, i.e., far enough from the modal caustic at $\zeta_{cn}$, one may employ the WKB approximations for the parabolic cylinder functions to obtain


exp[i  j tpj^(|)d|]  exp[i  f :p^(4)d|  + Zi  J ^ cp^(|)d4] 


-2i[cp„(;)v  (;')]^^^  ^ " 


-2i[:p^(;) 


(11) 


where 


$$\varphi_n(\zeta) = \left(\frac{\zeta^2}{4} - p_n\right)^{1/2} \qquad (12)$$


and 


B 


exp(  -itr  / 4)  *4  ^2 


r(T+i^) 


-ip. 


r(i-i^) 


p exp[ip^(l  - Inp^)] 


(13) 


The first term in Eq. (11) represents the direct contribution from the source point at $\zeta'$ to the observation point at $\zeta$, while the second term describes a wave that has traveled from the source point to the modal caustic and thence to the observation point. The reflection coefficient due to the caustic, as seen from the observation point, is given by the ratio of the reflected and incident waves as:


K (;)  - exp[2i  J cp  (|)d4l 
n n n 


(14) 


which has been given previously (Ref. 10).

When  the  results  in  Eqs.  (5)  and  (11)  are  substituted  into  Eq.  (2),  one  obtains  the 
WKB  approximated  Green's  function  for  ^ 


r 
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sin[h  J 4>„(T)dT]  sin[h  J ^ d'r'} 

G > y n n 

n N^[4)n(ri)4jn(’1  ^ 


exp(-i  j ^ cp^(|)d|)+  exp[-i  «Pj^(4)d|  + 2i  / cp^(|)d|] 

^ ^ ^cn 

B.  Response  to  a Directive  Source 

The Green's function in Eq. (15) represents the field excited by a line source of strength

$$G_0(|\rho - \rho'|) = \frac{i}{4} H_0^{(1)}(k|\rho - \rho'|) \sim \frac{i}{4}\left(\frac{2}{\pi k|\rho - \rho'|}\right)^{1/2} \exp\left[ik|\rho - \rho'| - i\pi/4\right] \qquad (16)$$

When the source has a radiation pattern $f(\theta)$ so that its far field is

$$G_p \sim G_0\, f(\theta) \qquad (17)$$

the response in the waveguide can be calculated by decomposing the source-point dependent eigenfunction into its local plane wave constituents and multiplying these by $f(\pi + \theta_n^{\pm})$, where $\theta_n^{\pm}$ are the propagation angles of the local plane waves in the n-th mode at the source point. Thus, the field $G$ at $\rho$ produced when the directive source in Eq. (17) is placed inside the waveguide is given by Eq. (15) provided that one makes the replacement


$$\sin\left[h\int_{-\eta_1}^{\eta'}\psi_n(\tau)\,d\tau\right] \to (2i)^{-1}\left\{ f(\pi + \theta_n^{+}) \exp\left[ih\int_{-\eta_1}^{\eta'}\psi_n(\tau)\,d\tau\right] - f(\pi + \theta_n^{-}) \exp\left[-ih\int_{-\eta_1}^{\eta'}\psi_n(\tau)\,d\tau\right] \right\} \qquad (18)$$

If the source is located on the lower wall or the upper wall, one puts $f(\pi + \theta_n^{+}) = 0$ and $f(\pi + \theta_n^{-}) = 0$, respectively; the pattern functions now represent the respective far zone fields of the source in the presence of the boundary whereon it is situated.

C.  Reflection  Coefficient  for  the  Open-Ended  Waveguide 

1.  General  Formulation  for  Large  Fresnel  Numbers 

When  the  waveguide  in  Fig.  2 is  truncated  at  reflection  occurs  from  the 

open end. By the ray optical method, the modal reflection and coupling coefficients are calculated from the single edge diffraction patterns. This requires first a determination of the local plane wave fields that illuminate the edges. Then, one calculates the diffraction field due to the upper edge, observed at an angle $\beta$ with respect to the tangent plane at the edge. This provides the strength and pattern of the equivalent line source in Eq. (17), which is then inserted into Equations (18) and (15). Omitting the term multiplied by $B_n$ in Eq. (15), which represents fields not relevant for edge reflection (they arrive after a round trip to the center of the resonator), one obtains for the total reflected field $E_{rj}$ due to an incident mode j:


^rj  } sin[h/  4>^(  t)  d ^ ] exp  [-i  / (19) 


jn 


’cn 


where 


r = 


[l  + (-l)"+j]V(p,.,  p ) 

—iL. 


8i(2h)‘^^[^^(Tij)  4^.(qj)  cp^(  ;)  tp.(U]^^^N^N. 


exp[i  f cp  (4)d4  +i  J cp  (4)d|] 


(20) 


'c3 


Here, $\Gamma_{jn}$ is the coupling coefficient, due to diffraction at both edges, from the incident mode j to mode n, and $V(\beta_n, \beta_j)$ is the edge diffraction coefficient for a perfectly reflecting half plane illuminated by a unit strength plane wave incident at the angle $\beta_j$ and observed at the angle $\beta_n$:

$$V(\beta_n, \beta_j) = -\sec\frac{\beta_n + \beta_j}{2} + \sec\frac{\beta_n - \beta_j}{2} \qquad (21)$$

$\beta_n$ and $\beta_j$ denote the characteristic propagation angles of the local plane waves that comprise modes n and j.
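A minimal numerical sketch of the single-edge diffraction coefficient may help in judging where it is well behaved. It assumes that Eq. (21) has the form $V(\beta_n, \beta_j) = -\sec[(\beta_n + \beta_j)/2] + \sec[(\beta_n - \beta_j)/2]$; the function name is hypothetical and angles are taken in radians. The half-sum secant grows without bound as $\beta_n + \beta_j$ approaches $\pi$, so the values are meaningful only away from that limit.

```python
import math


def edge_diffraction_coefficient(beta_n: float, beta_j: float) -> float:
    """Half-plane edge diffraction coefficient (assumed form of Eq. (21)).

    beta_j: propagation angle of the incident local plane wave, radians.
    beta_n: observation angle, radians.
    """
    half_sum = 0.5 * (beta_n + beta_j)
    half_diff = 0.5 * (beta_n - beta_j)
    # sec(x) = 1 / cos(x); the half-sum term diverges at half_sum = pi/2.
    return -1.0 / math.cos(half_sum) + 1.0 / math.cos(half_diff)


if __name__ == "__main__":
    for beta in (0.1, 0.5, 1.0, 1.4, 1.55):
        v = edge_diffraction_coefficient(beta, beta)
        print(f"beta_n = beta_j = {beta:.2f} rad  ->  V = {v:.4f}")
```

The printed values grow rapidly in magnitude as both angles approach $\pi/2$, which is the regime in which single-edge diffraction alone is no longer an adequate description.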

2.  Simplification  for  Moderate  Fresnel  Numbers 


To check whether the general coupling coefficient in Eq. (20) reduces to the local parallel plane approximation for small enough Fresnel numbers and in particular for weakly curved mirrors (small $\gamma = L/r$ in Eq. (5c)), the limiting form of Eq. (20) in that parameter range is now examined. As noted earlier, the modes of interest are those with caustics near the center of the resonator. Thus, the edges may be taken to lie far


enough  from  the  caustics  to  justify  ^ « p.  Moreover,  p itself  is  small.  When  these 


approximations  are  introduced,  one  finds; 
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4jj.  cos  T). 

V(P  , P ) a-  (sgn  P )(sgn  P ) —-2 2 rz r 

" J J(2fi  + 1 - b^)  + (2ji  + 1 - bp 


In  M *3  2 Y , a « V , 

n.j 


(22) 


(23) 


«,  (p2  . = (2/h)*^^  Cfpc-) 

2fi 


,1/2 


Thus,  Eq.  (20)  reduces  to,  when  the  phase  reference  is  at  the  edges. 


(24) 


(25) 


and 


(26) 


1 2 

These expressions agree with the parallel plane formulas of Weinstein for the case where $\beta_n$ and $\beta_j$ differ sufficiently from $\pi/2$ to ensure that one edge does not lie in the reflected ray transition region of the other. In that event, single-edge diffraction is adequate to describe reflection from the open-ended parallel plane structure. As noted previously (Ref. 9), $s_n$ represents the modal propagation coefficient in the equivalent parallel plane waveguide whose height equals that of the hyperbolic waveguide at Q.


When $\gamma = L/r$ is reduced further so that one edge does lie in the reflected ray transition region of the other edge, interaction between the edges in the local parallel plane model cannot be ignored. The single-edge diffraction function must now be replaced by a more accurate function derived from the rigorous solution of the semi-infinite parallel plane configuration. The result as given by Weinstein (Ref. 20), modified to account for the different normalization of the incident field used here, is as follows:


$$\Gamma_{jn} = -i\left[(s_n + s_j)(s_n s_j)^{1/2}\right]^{-1} \exp\left[U(s_n, \delta) + U(s_j, \delta)\right] \qquad (27)$$


where $\delta$ is defined by any of the equalities

$$\frac{2\pi(j + \delta)}{\ln M} = p_n = \frac{h}{2}\left(b_n^2 - 1\right) = \frac{n\pi - 2kL}{\ln M}, \qquad j = 0, \pm 1, \pm 2, \ldots \qquad (28)$$


The diffraction function $U(s, \delta)$ is discussed in detail in Ref. 20 and has the following asymptotic behavior:

$$U(s, \delta) \to 0, \qquad s\ \text{large} \qquad (29)$$

U(s, δ) ~ j(ln 2 + ^) + ln(2s) - βs + ···, s small (30)

where $\beta = 0.824$. Thus, Eq. (27) reduces to Eq. (25) when $s_n$, $s_j$ are sufficiently large (note that this condition can be met although $\gamma$ is small). For small $s_{n,j}$ one has

$$\Gamma_{jn} = -2(s_n + s_j)^{-1}(s_n s_j)^{1/2} \exp\left[-(1 - i)\beta(s_n + s_j)\right] \qquad (31)$$

which (noting that $s_n \approx s_j$ here) was used in the analysis by Chen and Felsen (Ref. 9). With formulas (20) and (27), noting the overlap region in Eq. (25), one may cover the entire range of parameters from small to large Fresnel numbers for the hyperbolic mirror resonator.

This completes the objective stated at the beginning of this report.

ANALYSIS AND EXPERIMENTAL EVALUATION OF LOSSY GRATING COUPLERS
S.  Austin,  S.T.  Peng,  F.T.  Stone  and  T.  Tamir 

Much theoretical work has been published on optical grating couplers, but the experimental verification of these results has not kept pace with this theoretical progress. We therefore discuss here the outcome of a joint theoretical and experimental effort wherein the performance of lossy grating couplers has been studied by combining analytical methods with laboratory measurements. In particular, we have evaluated the performance of grating couplers made holographically in Shipley AZ 1350B photoresist, which may incur substantial losses in coupling efficiency if they are not properly designed. However, we have found that careful design and fabrication can minimize scattering and absorption losses so as to realize coupling efficiencies that are close to those predicted by theoretical considerations.

To produce the desired periodic structure for the grating-coupler device, we use an appropriate initial photoresist thickness obtained from calibration curves of resist thickness vs. spinner speed and resist concentration (Ref. 1). Then the experimentally determined etch depth vs. exposure characteristic can be used together with the linear dependence of etch depth on development time (Ref. 2) to construct the coupler. It is helpful to computer-generate families of grating profiles as a function of the various parameters. The profile nearest the desired one can then be selected, thereby fixing the experimental variables. The above procedure has been verified by comparing measured and predicted values of grating diffraction efficiencies.

The finished grating structure generally exhibits a five-layer configuration as shown in Fig. 1, i.e., it consists of a substrate, a waveguide (film) of thickness $t_f$, a residual photoresist layer of thickness $t_r$, a grating having a height $t_g$ and an upper air region. For a variety of reasons, it is important to control the grating profile and, in particular, to achieve $t_r = 0$. Thus, to obtain the highest available coupling efficiency for incident beams having Gaussian profiles, it is well known (Ref. 3) that the grating and beam parameters must satisfy $\alpha w_0 = 0.68$, where $\alpha$ is the imaginary (leakage) part of the propagation constant and $w_0$ is the beam half-width measured along the waveguide. As the value of $\alpha$ depends on all the geometrical and physical parameters of the grating configuration in Fig. 1, their judicious choice is important if a practical beam-width satisfying $w_0 = 0.68/\alpha$ is desired. In addition, the absorption losses in the photoresist are expected to degrade the performance of the grating coupler. As the photoresist is required essentially only to provide a periodic profile, we expect that the presence of a residual layer of thickness $t_r$ could have a deleterious effect on performance.

Fig. 1. Five-layer grating structure.

To both determine a suitable value for $\alpha$ and to verify the conjecture that $t_r = 0$ is an optimal condition, we have examined the propagation characteristics of the five-layered structure by using a perturbation procedure, the accuracy of which was verified by a rigorous approach. The calculations thus obtained have guided us in designing the grating structures that were subsequently used in the experimental studies. Furthermore, these calculations have confirmed the expectation that the presence of any residual resist layer does indeed degrade coupler performance. We also found that this degradation is especially strong when the waveguide refractive index $n_f$ is smaller than the real part $n'_g$ of the refractive index $n_g$ of the photoresist material, in agreement with the observations of Dalgoutte and Wilkinson (Ref. 6).

In fact, because of the presence of non-negligible losses, our theoretical considerations show that the significant decay parameter is no longer the leakage constant $\alpha$. Instead, we must use an effective (total) parameter $\alpha_{eff} = \alpha + \alpha_{abs}$, where $\alpha_{abs}$ represents decay due to absorption losses. Thus when $\alpha_{abs}$ is comparable in magnitude to $\alpha$, the optimum value of $w_0$ for maximum coupling efficiency may be appreciably different from the value determined by the relationship $\alpha w_0 = 0.68$.

To confirm these considerations, we have fabricated a number of grating couplers by following the procedures described above and have subsequently compared the leakage and coupling performance to that predicted theoretically. Using the gratings as output couplers, $\alpha_{eff}$ can be obtained from a scan of the output beam profile. The optimum beamwidth $w_0$ can alternatively be obtained by using the grating as an input coupler and examining the coupling efficiency as a function of beamwidth. The formula $\alpha_{eff} = 0.68/w_0$ then yields a second value for $\alpha_{eff}$. A close correlation has been found between these two experimental values of $\alpha_{eff}$ and that predicted from theory.

Gratings  were  intentionally  fabricated  with  various  residual  resist  layers  to  study 
the  effect  of  loss.  Table  I gives  the  results  for  several  samples.  From  this  table, 
we  can  draw  the  following  conclusions: 




TABLE I

                Power split        Peak efficiency                      α_eff (µm)^-1
t_g     t_r     air      sub                            w_0     α                  Experiment
(Å)     (Å)     (η_a)    (η_s)     (Th)      (Exp)      (µm)    (Theory)  (Th)     (eff)     (scan)

 800    3000    0.47     0.53      37.6%     31%         79     0.0035    0.0076   0.0086    0.0081
2100    1700    0.50     0.50      40%       33.5%      102     0.0059    0.0075   0.0067    0.0082
3100     700    0.47     0.53      37.6%     37%        103     0.0072    0.0077   0.0066    *

* not suitable as output coupler

Note: d = 3160 Å, λ = 6328 Å, n''_g = 0.001

(1) The measured values of $\alpha_{eff}$ agree with those calculated if the imaginary part of the resist index is taken to be $n''_g = 0.001$.

(2) As the thickness of the residual resist layer becomes smaller, the effect of loss becomes negligible, i.e., loss in the grating itself is apparently not a factor.

(3) The value of $\alpha_{eff}$ can be obtained by adding the absorption loss factor obtained from an analysis of the uniform five-layer waveguide to the leaky wave $\alpha$ obtained from a lossless analysis of the grating structure.
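The optimum-coupling relation between decay constant and beam half-width can be checked against the tabulated samples. The sketch below assumes that the beam half-width column of Table I is in micrometers and that the experimental column obtained from the input-coupling measurement was computed as 0.68 divided by the half-width; the function name and the data layout are illustrative only.

```python
# Optimum-coupling condition quoted in the text: alpha * w0 = 0.68.
OPTIMUM_PRODUCT = 0.68


def decay_from_beamwidth(w0_um: float) -> float:
    """Return the decay constant (1/um) implied by an optimum half-width w0 (um)."""
    if w0_um <= 0:
        raise ValueError("beam half-width must be positive")
    return OPTIMUM_PRODUCT / w0_um


# (w0 in um, tabulated decay constant in 1/um), as read from Table I.
# The pairing of columns is an assumption about the table layout.
TABLE_I_SAMPLES = [(79.0, 0.0086), (102.0, 0.0067), (103.0, 0.0066)]

if __name__ == "__main__":
    for w0, tabulated in TABLE_I_SAMPLES:
        computed = decay_from_beamwidth(w0)
        print(f"w0 = {w0:5.0f} um  computed = {computed:.4f}  tabulated = {tabulated:.4f}")
```

Under these assumptions the computed and tabulated values agree to the four decimal places given in the table.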

Input coupling efficiencies and the power split between the two beams at an output coupler also agreed quite well with predicted results. Power split factors of 50% - 50% were usually expected and measured. Using this value of power split, an input coupling efficiency of 0.5 × 80% = 40% is predicted. Values around 35% were obtained using the dip method.
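The arithmetic behind the predicted input coupling efficiency is a simple product, sketched here for the power split values quoted in the text and in Table I. The 80% figure is the one used in the text for the best attainable coupling of a Gaussian beam; the function name is hypothetical.

```python
# Fraction of a Gaussian beam that can be coupled at best, as used in the text.
MAX_GAUSSIAN_COUPLING = 0.80


def predicted_input_efficiency(power_split_fraction: float) -> float:
    """Predicted input coupling efficiency for a given power split fraction.

    power_split_fraction: fraction of the leaked power carried by the beam
    used for coupling (0.5 for a 50% - 50% split).
    """
    if not 0.0 <= power_split_fraction <= 1.0:
        raise ValueError("power split fraction must lie between 0 and 1")
    return power_split_fraction * MAX_GAUSSIAN_COUPLING


if __name__ == "__main__":
    for split in (0.47, 0.50):
        eff = predicted_input_efficiency(split)
        print(f"power split {split:.2f} -> predicted efficiency {100 * eff:.1f}%")
```

A split of 0.47 gives 37.6% and a split of 0.50 gives 40%, matching the theoretical peak efficiencies listed in Table I.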

In conclusion, we have demonstrated a procedure for fabricating gratings in photoresist to predetermined specifications. The theoretically expected and experimentally measured properties agree well if the effect of photoresist loss is included. The effect of loss is due almost entirely to the residual resist layer and, for resist refractive index larger than film index, the effect becomes appreciable if a residual layer is present even though its thickness $t_r$ is equal to only a small fraction of the film thickness $t_f$.

EFFECTS OF IMPERFECTIONS ON THE EFFICIENCY OF OPTICAL COUPLERS
D.W.  Fradlin,  P.K.  Cheo,  S.T.  Peng  and  T.  Tamir 

The efficient operation of thin-film devices requires that a substantial fraction of the input laser power be coupled through the waveguide and into a useful output beam. When high-quality waveguide structures are used, efficient coupling can be achieved with either a prism or a grating coupler. Work with free-standing (slab) waveguides used in an early model of a thin-film modulator for CO2 laser radiation (Ref. 1) has shown both that practical waveguide structures can have serious structural imperfections in the coupling regions and that device constraints may preclude the use of optimized waveguide and coupler designs.

A theoretical study of the effects of structural imperfections on the performance of grating and prism couplers has been conducted. This study was motivated by experiments with symmetric GaAs waveguides that were thinned by chemo-mechanical and ion-mill techniques. It is shown that structural imperfections such as waveguide thickness variations in the coupling region and grating groove depth variations can significantly reduce the coupling efficiency. The partition of energy between two beams above and below the gratings, which represents a serious loss mechanism for the grating coupler, is analyzed for the symmetric GaAs waveguide. The manner in which thickness variations in the coupling region alter this partition is discussed. Data which tend to confirm the predictions of the theory are summarized, and the results of a direct experimental comparison between grating and prism couplers are given (see Figure 1). It is shown that, for the free-standing modulator waveguides, the prism coupler is superior because it leads to both enhanced coupling efficiency and enhanced output beam quality. This work establishes fabrication tolerances for the coupler structures.

Fig. 1. Free-standing (slab) waveguide: (a) grating couplers, (b) prism input coupler.

The analysis of coupler performance is based largely on two computational techniques. The first technique involves the use of the reciprocity theorem (Ref. 2), and the second involves a perturbation calculation using a transmission line analog (Ref. 3). Calculations based upon the full solution to the vector wave equation (Ref. 4) are shown to agree with the approximate calculations.

The deleterious effects of waveguide thickness variations in the coupling region increase with increasing mode number for both the grating and the prism couplers.

For  the  25  p thick  wafer  waveguide  used  in  the  early  thin-film  modulator,  a thickness 

variation  of  about  8 p,  over  the  input  beam  will  reduce  the  input  coupling  for  the  TE^ 

mode  by  a factor  of  2;  for  TE^  and  TE^  modes,  thickness  variations  of  1.3  p and  0.  7 p, 

respectively, will have the same deleterious effect. Because the coupling is relatively weak for the modulator waveguide, higher-order modes must be used for efficient coupling (Ref. 2). Thickness variations on the order of 1 µm over a 1 to 2 mm region of the input coupler were typically observed in the free-standing modulator waveguides.

Other structural imperfections that are particular to the grating coupler can degrade coupling efficiency. These imperfections include groove depth variations, variations in the profile of the grooves, and surface roughness which will cause scattering losses. The first two effects are treated in terms of variations in both effective waveguide thickness and the coupling parameter, and the last effect is calculated by using the results of an approximate diffraction model. Groove depth variations equal to half the average groove depth are considered. It is shown that for the modulator waveguide, the deleterious effects of groove geometry variations are minimized when the groove depth is approximately equal to the spatial extent of the evanescent field in the air region.

The partition of energy that results from multiple beams is analyzed for the grating coupling on a symmetric modulator waveguide. Because the lower diffracted beam can carry a significant amount of energy at both the input and the output gratings, the optical losses from multiple beams must be considered in the design of couplers. Variations in the partition of energy with the thickness of the coupler region are calculated, and the effects of such thickness variations on the intensity distribution of the output beam are discussed.

Experimental data with a free-standing waveguide are shown which indicate that structural imperfections in the coupler regions can be sufficiently large to influence total coupling efficiency. These data include direct measurements of thickness variations in the waveguide, measurements of beam steering within the waveguide that occurs as a consequence of transverse thickness variations in the guide, and measurements of the intensity distribution of the output beam as a function of mode number. A direct comparison between coupling with a grating and coupling with a prism into the same waveguide shows that, for the modulator waveguide, the coupling efficiencies attainable with the prism are nearly an order of magnitude higher than those attainable with the grating. This striking difference is interpreted in terms of the analysis of structural imperfections.

This work was carried out as a joint effort between groups at the Polytechnic and at United Technologies Research Center.

RF-SPUTTERING OF ALUMINUM-OXIDE FILMS FOR INTEGRATED OPTICS
APPLICATIONS 


L.  Bergstein,  E-W.  Hu  and  M.  Eschwei 


Among all the existing vacuum deposition techniques, it seems that sputtering is one of the most attractive methods for the preparation of low-loss films for integrated optics applications. Despite the complex plasma processes associated with sputtering, the important sputtering parameters can be accurately controlled to yield reproducible results. Moreover, the deposition parameters such as gas content and sputtering powers can be readily adjusted over a wide range. This, combined with a relatively slow deposition rate, offers the possibility of deposition of films of variable refractive index. It is this last aspect which was the main motivation for the reported investigation of sputtered aluminum-oxide films.


Numerous works on thin films fabricated by sputtering techniques have been reported in the literature. We report in this work a systematic study of rf-sputtered aluminum-oxide films, emphasizing their potential applications in integrated optics.

The choice of Al2O3 in this work is based on the following. First, Al2O3 seems to be a very attractive material because of its proven versatility in conventional thin-film optical devices. It has been reported that thermally evaporated Al2O3 films are mechanically and chemically stable and transparent over a wide spectral range. Second, rf-sputtered Al2O3 films have been extensively investigated for microelectronic applications, mainly for their superior performance over SiO2 films as an effective mask against dopants, alkaline ion diffusion, radiation damage, etc. (Ref. 1). To our knowledge, no systematic and successful work on rf-sputtered Al2O3 film for integrated optics application has yet been reported. Efforts were made during the past year in this laboratory to obtain optical low-loss and reproducible rf-sputtered Al2O3 films. Using oxygen and/or argon, only moderate success was achieved. However, despite the lack of reproducibility and inability to fabricate very low loss films, the results were encouraging in the sense that the refractive index of the films did seem to vary with the sputtering parameters over a rather wide range (Ref. 2). We therefore decided to investigate the problem in a more detailed and systematic way.

We performed a series of runs using nitrogen and argon separately and in combination as the sputtering ion source. Since the sputtering processes take place at elevated temperatures, it was expected that the inclusion of oxygen would prevent the concentration of free aluminum (dissociated from the Al2O3 target at elevated temperatures) and thus enhance the quality of the oxide films. Contrary to our expectations, the results show that inclusion of oxygen in the sputtering chamber degrades the film. This is also indicated by Deitch et al. (Ref. 3), who found that oxygen is not the best atmosphere for the deposition of oxides by sputtering. Similar observations were also reported recently (Ref. 4) on some other oxide films prepared by dc-sputtering. Using only nitrogen and/or argon, we were able to obtain optical Al2O3 films of excellent quality. More importantly, by varying the sputtering parameters, we were able to vary the refractive indices of these films over a very wide range, from a low value of 1.48 at 6,328 Angstroms to a high value of 1.67.


Experimental  Results 


The  experiment  was  carried  out  in  a new  Perkin-Elmer  model  2400  sputtering 
system  with  a well  trapped  six-inch  oil  diffusion  pump  capable  of  yielding  a vacuum 
better  than  5x10'  torr.  Ultrahigh  purity  sputtering  gases  such  as  N^,  and  Ar  were 
admitted  separately  into  the  vacuum  chamber  and  the  pressures  (of  up  to  40  microns) 
were accurately controlled and monitored. The hot-pressed Al2O3 target had a purity of 99.995%. Standard microscope slide glasses with refractive index of 1.5154 at 6328 Å were used as our basic substrates for most of the experiments. In order to obtain guided wave modes, fused quartz plates with a refractive index of 1.4567 were occasionally necessary when the refractive index of the films was lower than that of the microscope slides.
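The substrate choice described above follows from the requirement that a film can guide modes only if its refractive index exceeds that of the substrate. A short sketch, with the two substrate indices taken from the text and a hypothetical helper name, makes the selection explicit:

```python
# Substrate refractive indices at 6328 Angstroms, as quoted in the text.
SUBSTRATES = {
    "microscope slide glass": 1.5154,
    "fused quartz": 1.4567,
}


def usable_substrates(n_film: float) -> list[str]:
    """Return the substrates on which a film of index n_film can guide modes.

    Guidance requires n_film > n_substrate; film thickness is not checked here.
    """
    return [name for name, n_sub in SUBSTRATES.items() if n_film > n_sub]


if __name__ == "__main__":
    for n_film in (1.48, 1.55, 1.67):
        print(f"n_film = {n_film:.2f}: {usable_substrates(n_film)}")
```

For the lowest film index reported (1.48), only fused quartz qualifies, which is consistent with the occasional need for quartz plates.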


Immediately after proper cleaning, we transferred the substrates into the vacuum chamber and evacuated the system. The target was then pre-sputtered for at least 40 minutes in order to remove possible target surface contamination. During the deposition, both the target and the base plate were water cooled. However, after about ten minutes the temperature of the substrates had risen to and stabilized at about 200 to 350°C for sputtering powers of 100 to 500 Watts. A series of runs was performed by systematically changing the sputtering gas content and the sputtering rf-powers.


Our  primary  interest  is  to  obtain  low-loss  films  and  to  determine  the  dependence 
of  the  refractive  index  on  the  sputtering  parameters.  The  prism  coupler  method  was 
used  for  measuring  the  refractive  index.  Typical  film  thickness  was  about  two  microns 
so  that  several  modes  were  guided.  The  measurements  for  both  TE  and  TM  modes 
were  carried  out  and  the  results  showed  no  difference  between  the  two  cases.  This 
assured  the  accuracy  of  the  measurements  and  ruled  out  the  possibility  that  the  films 
are  birefringent.  Relative  loss  measurements  were  performed  using  the  same  setup 
described  in  Reference  2. 
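The remark that a film about two microns thick guides several modes can be checked with the standard cutoff condition for TE modes of an asymmetric slab waveguide. This condition is textbook material and is not taken from the report; the air cover (index 1.0) and the function name are assumptions, while the film and substrate indices are those quoted in the text.

```python
import math


def count_te_modes(n_film: float, n_sub: float, n_cover: float,
                   thickness_um: float, wavelength_um: float) -> int:
    """Number of guided TE modes of an asymmetric slab waveguide.

    Uses the cutoff condition V_m = m*pi + atan(sqrt(a)), where
    V = (2*pi/wavelength) * thickness * sqrt(n_film^2 - n_sub^2) and
    a = (n_sub^2 - n_cover^2) / (n_film^2 - n_sub^2).
    """
    if not (n_film > n_sub >= n_cover):
        return 0  # no guidance unless the film has the highest index
    index_step = n_film ** 2 - n_sub ** 2
    v = 2.0 * math.pi / wavelength_um * thickness_um * math.sqrt(index_step)
    asymmetry = (n_sub ** 2 - n_cover ** 2) / index_step
    v_cutoff_0 = math.atan(math.sqrt(asymmetry))
    if v <= v_cutoff_0:
        return 0
    return int(math.floor((v - v_cutoff_0) / math.pi)) + 1


if __name__ == "__main__":
    # Highest-index film on microscope slide glass, 2 um thick, 6328 Angstroms.
    print(count_te_modes(1.67, 1.5154, 1.0, 2.0, 0.6328))
    # Lowest-index film on fused quartz, same thickness and wavelength.
    print(count_te_modes(1.48, 1.4567, 1.0, 2.0, 0.6328))
```

Under these assumptions the high-index film supports five TE modes and the low-index film on quartz supports two, so more than one mode is available for the prism-coupler index measurement in both cases.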


Figures 1 to 3 show some typical plots of the refractive indices of the films versus rf-sputtering powers with sputtering gas content (N2 and/or Ar) as a parameter. (Since most films obtained with O2 were lossy, they are not included in our results.) Figures 4 to 6 show the corresponding rates of deposition as a function of sputtering powers.

Refractive index versus rf-sputtering power with gas pressure as a varying parameter; sputtering gas is argon.

Fig. 4. Rate of deposition versus rf-sputtering power with gas pressure as a varying parameter; sputtering gas is nitrogen.

Table I gives a convenient insight into the loss figures as a function of both rf-sputtering powers and sputtering gas content. It shows that the films are of low loss with an intermediate range (about 300 Watts) of applied rf-powers, independent of the sputtering gas content (N2 and/or Ar). The films sputtered with high powers (≈ 500 Watts) are generally good for a nitrogen-rich sputtering atmosphere but become poorer with increasing argon gas content. At low sputtering powers (≈ 100 Watts), the films are in general lossy. All the films are mechanically hard and durable, as indicated by the fact that the refractive indices remain unchanged after the films have been exposed to the atmosphere for an extended period of time (a few months).

TABLE I. Relative optical losses of the films as functions of the rf-sputtering power and sputtering gas content.

Relative Losses: P = poor (> 10 dB/cm), F = fair (5 to 10 dB/cm), G = good (2 to 5 dB/cm), E = excellent (< 1 dB/cm).
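The loss grades in the legend can be expressed as a small lookup, sketched below with a hypothetical function name. Two points are assumptions rather than statements from the report: the legend does not assign a grade to losses between 1 and 2 dB/cm, so the sketch returns None there, and the shared boundaries at 5 and 10 dB/cm are assigned to the "fair" grade.

```python
from typing import Optional


def classify_loss(loss_db_per_cm: float) -> Optional[str]:
    """Map an optical loss in dB/cm to the grade letters of the Table I legend.

    Returns None for the 1 to 2 dB/cm interval, which the legend leaves unassigned.
    """
    if loss_db_per_cm < 0:
        raise ValueError("loss must be non-negative")
    if loss_db_per_cm > 10.0:
        return "P"  # poor
    if loss_db_per_cm >= 5.0:
        return "F"  # fair (boundary values assigned here by assumption)
    if loss_db_per_cm >= 2.0:
        return "G"  # good
    if loss_db_per_cm < 1.0:
        return "E"  # excellent
    return None  # between 1 and 2 dB/cm: not covered by the legend


if __name__ == "__main__":
    for loss in (0.5, 1.5, 3.0, 7.0, 12.0):
        print(f"{loss:5.1f} dB/cm -> {classify_loss(loss)}")
```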

We note from Figs. 1 to 3 that the refractive indices of the films depend strongly on the deposition parameters and can be varied over a relatively wide range, from about 1.48 to 1.67. Moreover, using pure nitrogen as sputtering gas, the refractive indices of the films are much more sensitive to the rf-power change than those using pure argon as sputtering gas. This is possibly because N2 is more readily trapped in the film than Ar. Efforts were made to relate the refractive indices to the film density. However, the results were not conclusive and are not reported here. Some X-ray diffractometer and Laue back-reflection measurements were performed on a number of samples. No crystalline structure was detected. Further work on the atomic level is probably necessary in order to gain a better understanding of the film structure and the sputtering processes.

It should be pointed out that the optical film losses can most probably be further
reduced by using optically polished substrates and a liquid nitrogen cold-trap to
reduce diffusion pump backstreaming.

n.  Summary 


We have successfully fabricated low-loss Al2O3 films with refractive indices that can be
varied over a wide range. This attractive feature, along with the mechanical and chemical
durability of the films, makes the Al2O3 film a potential material for many optics and
integrated optics applications. Examples are multilayer spectral filters and polarizers,
polarization-independent optical waveguides, strip-line waveguiding structures, etc.

Joint  Services  Technical  Advisory  Committee 

F44620-74-C-0056  L.  Bergstein,  E-W.  Hu  and  M.  Eschwei 

REFERENCES 

1. T.N. Kennedy, "Evaluating RF Sputtered Al2O3 for Microcircuit Fabrication,"
Electronic Packaging and Production, 136-141 (December 1974).

2. F.T. Stone and M. Eschwei, "Optical Properties of Sputtered Aluminum Oxides
(Al2O3) Films for Application to Integrated Optics," Progress Report No. 39 to
JSTAC, Polytech. Inst. of New York, Report No. R-452.39-74, 171-175 (1974).

3. R.H. Deitch, E.J. West, T.G. Giallorenzi and J.F. Weller, "Sputtered Thin Films
for Integrated Optics," Appl. Optics, 13, 712-715 (1974).

4. S.J. Ingrey, W.D. Westwood, Y.C. Cheng and J. Wei, "Variable Refractive Index
and Birefringent Waveguides by Sputtering Tantalum in O2-N2 Mixtures," Appl.
Optics, 14, 2194-2198 (1975).

5. P.K. Tien, R. Ulrich and R.J. Martin, "Modes of Propagating Light Waves in Thin
Deposited Semiconductor Films," Appl. Phys. Lett., 14, 291-294 (1969).

6. L. Bergstein, "A Polarization-Independent Optical Waveguide," Progress Report
No. 36 to JSTAC, Polytech. Inst. of New York, Report No. R-452.36-71, 185-193
(1971).




QUANTUM  ELECTRONICS 


COMPOSITION  AND  EQUILIBRIUM  TOTAL  PRESSURE  OF  BISMUTH  VAPOR 
W.T.  Walter 

During our efforts [1,2] to demonstrate efficient, pulsed laser action in the vapor
of bismuth, it became apparent that a serious disagreement exists in the characterization
of bismuth vapor as reported in several compilations. A large disparity exists
between the most recent compilation of Nesmeyanov [3] and the prior ones of Hultgren et
al. [4] and Stull and Sinke. [5] The disparity in dimer percentage is displayed in Fig. 1 by
the three solid horizontal lines. The upper line through Hultgren's [4] points indicates
a dimer concentration of 50 to 60% in the pressure range of 1 mtorr to 10 torr. The
middle line through Nesmeyanov's [3] points indicates a dimer concentration of about 20%,
while the lower line through the points of Stull and Sinke [5] indicates a dimer
concentration of approximately 10%.
BISMUTH VAPOR PRESSURE (Torr)


Fig.  1.  Composition  of  bismuth  vapor  --  the  percentage 
of  dimers  in  the  vapor  as  a function  of  the  total 
equilibrium  vapor  pressure  of  bismuth. 
The disparity in total bismuth vapor pressure is shown in Figure 2. The upper
line through Nesmeyanov's [3] points indicates a total bismuth vapor pressure about ten
times higher than either the compilation of Hultgren et al. [4] or that of Stull and
Sinke, [5] which are represented by the lower line.


Fig.  2.  Total  equilibrium  vapor  pressure  of  bismuth  as  a 

function  of  the  reciprocal  of  the  absolute  temperature. 

To resolve these disparities, the early measurements were reexamined and compared
with several recent measurements. On this basis, as described below, we conclude
that the compilation of Hultgren et al. [4] best represents the actual situation in
bismuth.
In 1928 Leu [6] carried out a Stern-Gerlach experiment using bismuth vapor, and in
1931 Zartman [7] used a velocity analyzer on bismuth vapor. Both found that the vapor
of bismuth consisted of about 50% Bi2 molecules at total vapor pressures of 0.1 to 1 torr.
Ko [8] corrected and extended these initial experiments and reported dimer concentrations
of 60-70% at total vapor pressures of 0.1 to 1 torr. In 1941 Yosiyama, [9] using a
torsion-effusion method, obtained dimer compositions of the bismuth vapor of 40-50% at 1 to
10 mtorr. These experiments, indicating a substantial fraction of dimers in the vapor
of bismuth, were not used in the compilations of Nesmeyanov [3] or Stull and Sinke. [5]

More recently, the torsion-effusion experiments of Aldred and Pratt [10] and Kim
and Cosgarea, [11] the quasi-static and boiling point experiments of Fischer [12] and the
mass spectrometric experiments of Kohl, Uy and Carlson [13] and Rovner et al. [14] all confirm
the earlier work indicating that dimers constitute about half of the vapor of bismuth.
These more recent experimental data points are plotted in Fig. 1 along with the earlier
measurements. Hultgren's [4] compilation is in reasonable agreement with all of the data
in Fig. 1, while the compilations of Nesmeyanov [3] and Stull and Sinke [5] are not and must
be rejected. The mass spectroscopic study of Kohl, Uy and Carlson [13] appears to be the
most reliable of all the vapor composition experiments. It is in good agreement with
the earlier Hultgren compilation within its measurement range (0.3-42 mtorr), as indicated
by the dotted line in Figure 1. This experiment also confirmed the presence of a
small  amount  of  Bi^  molecules  (approximately  1%)  in  this  pressure  region. 

In Fig. 2, where the total bismuth vapor pressure is plotted as a function of the
reciprocal of the temperature, data points of the early velocity analyzer measurements
of Ko, [8] the more recent torsion-effusion experiments of Kim and Cosgarea [11] and
the mass spectroscopic measurements of Kohl, Uy and Carlson [13] are displayed in
addition to points from the three compilations. Also plotted are the recent heat-pipe
measurements of Schins et al. [15] at high bismuth vapor pressures (> 49 torr). All of these
data are in substantial agreement with the compilations of Hultgren [4] and Stull and Sinke [5]
and in serious disagreement with that of Nesmeyanov. [3]

Our conclusion is that the compilation of Hultgren et al. [4] is the best characterization
of both the total bismuth vapor pressure and the composition of the vapor in
terms of monomers and dimers. Therefore the values of Hultgren have been utilized
in our double-pulse discharge examination [2] of bismuth vapor for efficient, pulsed laser
action at 4722 Å.
DOUBLE-PULSE DISCHARGE EXAMINATION OF BISMUTH VAPOR FOR EFFICIENT,
PULSED LASER ACTION AT 4722 Å
W.T.  Walter  and  K.  Park 


The kinetics of electron collisions in pulsed electrical discharges in atomic vapors
with suitable energy level structures can produce efficient, pulsed laser action.
This has been demonstrated in the vapors of Pb, Mn, Cu, Au, Ca, Sr, and Ba.
Most experimental effort thus far has been applied to the copper vapor laser. Peak
powers of 170 kW and average powers of 15 W have been generated at 5105 Å in copper
vapor with a 1% overall electrical efficiency.

Similar laser action is possible in other elements as indicated in Table IV of
Reference 1. The possibility at 4722 Å in bismuth is of particular interest because it
would be close to the wavelength of maximum transmission through water. However,
in spite of attempts in a number of different laboratories, laser action has not, to our
knowledge, been observed at 4722 Å in the vapor of bismuth.


The vapor of bismuth contains a substantial percentage of dimers while the vapors
of copper and lead do not. Figure 1 compares the dimer concentration in the vapors of
bismuth, copper and lead in the pressure range 0.1 mtorr to 100 torr. This is
a broader pressure range, by a factor of at least ten at each end, than the copper and
lead vapor lasers have operated in thus far. The dimer concentration in bismuth vapor
is greater than 50% at pressures below 10 torr, while the dimer concentration in copper
vapor is less than 1% and in lead vapor less than 0.1% in a similar pressure region.
At 1 torr of total metallic vapor pressure, for example, where operation of both the
copper and lead vapor lasers is very strong, the dimer concentration in bismuth vapor
is 53% while it is only 0.4% in copper vapor and 0.06% in the vapor of lead.


The presence of bismuth dimers could adversely affect the possibility of laser
action at 4722 Å in several ways:

(1) directly by absorption at 4722 Å,

(2) by absorption of the 3067 Å bismuth resonance line, which will not only reduce
radiation trapping but can produce dissociation of the Bi2 dimers leaving one
Bi atom in the $6p^3$ metastable proposed lower laser level, [12,13]

(3) by means of collisional processes which interfere with the excitation process,
such as a modification of the electron distribution, or which quench the upper
resonance level.

To determine whether the presence of dimers in the vapor of bismuth could be the
cause for non-observance of laser action at 4722 Å, we experimentally examined two
methods of dissociating the dimers and tested the resulting vapor for laser action:
1) thermal dissociation and 2) discharge dissociation.
Fig. 1. Comparison of the mole fraction of dimers in the
vapors of bismuth, copper and lead. (Horizontal axis: vapor
pressure of the metal, torr.)

The split heat-pipe apparatus previously described [14] was used to create two distinct
temperature zones in a furnace and thermally dissociate the Bi2 molecules in the
hotter central zone as indicated in Table I. The temperature of the split heat-pipe end
zones was held at 900°C so that the pressure throughout the apparatus remained constant
at 1 torr. The values calculated in Table I are based on Hultgren's values both
of dimer concentration and of total bismuth vapor pressure, according to our critical
evaluation of the experimental data.

When thermal dissociation was experimentally tested, establishment of two distinct
temperature zones was evident. The temperature along the 6-inch long tungsten-mesh
split heat-pipe was examined by means of an optical pyrometer. The temperature
indicated along the inside of the mesh tube was uniform, confirming the wicking action of
liquid bismuth. The split heat-pipe also served as electrodes for the electric discharge.
No evidence of laser action at 4722 Å was observed, however.

A double-pulse discharge system was then utilized to dissociate the bismuth
molecules -- the first pulse to dissociate the Bi2 dimers and the second to excite the
Bi atoms. This method was first demonstrated by Chen, Nerheim and Russell to obtain
laser action in atomic copper using copper chloride vapor.

TABLE I. Thermal dissociation of bismuth dimers in the
split heat-pipe apparatus. The split heat-pipe
end zones are held at 900°C corresponding to a
total, equilibrium, bismuth vapor pressure of
1 torr and the temperature of the central zone
is increased.

      T1           T2        Dimer Concentration

     900°C        900°C             53%
     900°C       1100°C             11%
     900°C       1300°C            1.4%
     900°C       1500°C            0.3%
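
The dimer fractions of Table I can be converted into an approximate equilibrium constant for the dissociation Bi2 <-> 2 Bi. The sketch below is illustrative only: it assumes ideal-gas behavior, treats the tabulated dimer concentration as a mole fraction x at the constant total pressure P = 1 torr, and uses K_p = P(1 - x)^2 / x. The function names and the two-point van't Hoff estimate are our own additions and are not part of the reported analysis.

```python
import math

# Table I: central-zone temperature (deg C) -> dimer mole fraction (assumed meaning).
TABLE_I = {900: 0.53, 1100: 0.11, 1300: 0.014, 1500: 0.003}
TOTAL_PRESSURE_TORR = 1.0  # fixed by the 900 deg C end zones


def dissociation_constant(dimer_fraction, total_pressure=TOTAL_PRESSURE_TORR):
    """K_p = p(Bi)^2 / p(Bi2) in torr for Bi2 <-> 2 Bi, assuming ideal gases."""
    p_dimer = dimer_fraction * total_pressure
    p_atom = (1.0 - dimer_fraction) * total_pressure
    return p_atom ** 2 / p_dimer


def vant_hoff_enthalpy_kj_per_mol(t1_c, t2_c):
    """Two-point van't Hoff estimate of the dissociation enthalpy (hypothetical helper)."""
    gas_constant = 8.314462618e-3  # kJ / (mol K)
    k1 = dissociation_constant(TABLE_I[t1_c])
    k2 = dissociation_constant(TABLE_I[t2_c])
    inv_t1 = 1.0 / (t1_c + 273.15)
    inv_t2 = 1.0 / (t2_c + 273.15)
    return gas_constant * math.log(k2 / k1) / (inv_t1 - inv_t2)


if __name__ == "__main__":
    for temp_c, fraction in TABLE_I.items():
        print(f"{temp_c} C: K_p = {dissociation_constant(fraction):.3g} torr")
    estimate = vant_hoff_enthalpy_kj_per_mol(900, 1100)
    print(f"Two-point enthalpy estimate (900-1100 C): {estimate:.0f} kJ/mol")
```

Because the table entries carry only two significant figures, the enthalpy estimate should be read as an order-of-magnitude consistency check rather than a measurement.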

The hot-window discharge system utilized for these double-pulse experiments is
shown in Figure 2. Electrodes, a valve and quartz window-to-tube seals all capable
of withstanding temperatures as high as 1000°C were developed to construct a system
in which the entire laser tube can be operated at temperatures up to the 1000°C
devitrification limit of quartz. The entire laser discharge tube can then operate under
equilibrium temperature and vapor density conditions; the operational lifetime is
no longer limited by migration of the active material out of the hot zone.

Fig. 2. Hot-window quartz-tube laser-discharge apparatus capable
of sustained operation at temperatures up to 1000°C for
copper halide and bismuth dimer vapors.

The quartz laser discharge tube was connected to a sidearm, as indicated in Fig.
2, which contained 0.2 g of copper chloride and 1.25 g of bismuth shot. The sidearm
was placed in a second furnace to independently control its temperature and thereby
the vapor pressure of its contents. The optical cavity consisted of an aluminized
concave mirror (172.4 cm radius) and a flat beamsplitter (63% transmission) separated
by 120 cm.

The double-pulse discharge system is shown in Figure 3. Two Sprague 3900 pF
ceramic capacitors charged to 12 kV were independently discharged through the laser
tube by separate hydrogen thyratrons. The pulse generators provide an adjustable
delay time between the two pulses as well as an adjustable repetition rate of the
double-pulse signal.

Fig. 3. Double-pulse excitation system for
copper halide and bismuth dimer vapors.

The small amount of copper chloride was included in the sidearm along with bismuth
to demonstrate the efficacy of double-pulse excitation in our experimental system.
When the temperature of the main and sidearm furnaces was raised to 500°C, laser
action was observed on the 5105 Å and 5782 Å atomic copper transitions. Optimum delay
time between pulses was 28-30 µsec. The pressure range in which the laser would operate
was 3-5 torr of argon, with the optimum being 3.8 torr. Peak current amplitudes
were 13 and 15 amperes respectively for the two pulses, corresponding to peak current
densities of 400 to 500 A/cm². The copper vapor laser action was also used to optimize
the optical cavity alignment.

When the temperatures of the furnaces were raised above 700°C, the copper chloride
diffused through the open quartz ball and socket valve and bismuth took over the
discharge. The furnace temperatures were gradually taken as high as 970°C, corresponding
to an equilibrium total bismuth vapor pressure of 2.7 torr. Helium and argon
were used as buffer gases and tested throughout a pressure range of 0-500 torr. The
peak current amplitudes were operated up to 20 amperes (600 A/cm²). The time delay
between pulses was usually set at 30 µsec. When the delay was varied, the output light
signal at 4722 Å produced by the second pulse never increased from its value at 30 µsec.
The output light pulses were examined by means of a Jarrell-Ash 0.5 m spectrometer
which was usually set at 4722 Å. An iris in front of the spectrometer was adjusted to
narrow the field of view to that of the back mirror of the laser resonator as viewed
through the 2 mm i.d. discharge tube.
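
The quoted peak current densities can be checked against the 2 mm inner diameter of the discharge tube. The short sketch below assumes the current is distributed uniformly over the full bore cross-section, which is a simplification we introduce for the check; the function name is illustrative.

```python
import math

TUBE_INNER_DIAMETER_CM = 0.2  # 2 mm i.d. discharge tube


def peak_current_density(peak_current_a, diameter_cm=TUBE_INNER_DIAMETER_CM):
    """Peak current density in A/cm^2, assuming uniform flow over the tube bore."""
    bore_area_cm2 = math.pi * (diameter_cm / 2.0) ** 2
    return peak_current_a / bore_area_cm2


if __name__ == "__main__":
    # 13 A and 15 A: copper chloride runs; 20 A: maximum used with bismuth.
    for amperes in (13.0, 15.0, 20.0):
        print(f"{amperes:.0f} A -> {peak_current_density(amperes):.0f} A/cm^2")
```

Under this assumption the 13 and 15 ampere pulses fall within the quoted 400 to 500 A/cm² range, and 20 amperes corresponds to roughly the quoted 600 A/cm².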

No evidence of laser action at 4722 Å was observed during any variation of experimental
parameters. The amplitude of the output light signal at 4722 Å produced by the
second discharge pulse was usually equal to or less than that produced by the first
pulse. It was never more than 20% larger than the first. Even more disheartening was
the observation that the output light pulses did not change when the back mirror was
blocked. This indicates that the discharge was not transparent at 4722 Å.


Both the thermal dissociation and discharge dissociation methods were successful,
we believe, in substantially reducing the proportion of dimers in the vapor of bismuth.
The absence of enhancement of the light output signal at 4722 Å and the lack of transparency
of the discharge at the proposed laser transition, in spite of an estimated reduction in
dimer mole fraction in the vapor by a factor of 100, suggest that other processes are more
important in populating the $^2D_{3/2}$ proposed lower laser level. We conclude, therefore, that
the presence of dimers in the vapor of bismuth is not the dominant explanation for the
absence of laser action at 4722 Å.
SPONTANEOUS-EMISSION TRANSITION PROBABILITY OF THE 4722 Å ATOMIC
BISMUTH LINE

W.T.  Walter  and  N.  Solimene 

One of the criteria for efficient, pulsed atomic-vapor lasers [1] is that the risetime
of the excitation current pulse must be faster than the reciprocal of the transition
probability, $A_{ul}$, of the proposed laser line for laser action to be achieved in a three-level
energy level structure as indicated in Figure 1. If this criterion is not satisfied,
spontaneous radiation will drain the upper laser level and fill the metastable lower laser
level, thereby preventing the establishment of a population inversion. The total upper
level lifetime $(A_{ul} + A_{ug})^{-1} \approx A_{ul}^{-1}$ when the resonance radiation is well trapped at
the operating temperature for pulsed laser action. For example, in a 1 cm diam tube
at 1100°K containing 0.1 torr of atomic bismuth vapor pressure, [2,3] the radiation-trapped [4]
transition probability of the 3067 Å resonance line has been reduced from $2 \times 10^{8}$ to
$1 \times 10^{5}$ sec$^{-1}$; and the atomic bismuth vapor pressure for optimum peak laser output is
expected to be > 0.1 torr.
Fig. 1. Efficient pulsed gas-discharge
laser in an atomic vapor.

In our examination [5,6] of efficient, pulsed laser action at 4722 Å in the atomic vapor
of bismuth, the only measurement we have found for the 4722 Å spontaneous-emission
transition lifetime is the 111 nsec value of Corliss and Bozman. [7] Substantial errors,
some as large as a factor of 20, have been reported in the values of transition
probabilities determined by Corliss and Bozman [7] from the intensity measurements of Meggers,
Corliss and Scribner. Therefore one is reluctant to rely very strongly on these values.
Because accurate values of atomic transition probabilities are important in the evaluation
of new laser systems, we developed a procedure [9] to obtain improved values of the
Corliss-Bozman transition probabilities. When Fig. 1 of Ref. 9 is used with the bismuth
parameters ($E = 32588$ cm$^{-1}$, $I = 60 \times 10^{3}$, $\lambda = 4722$ Å and $U = 4.3$), an improved
Corliss-Bozman transition lifetime of 140 nsec is obtained for the 4722 Å line.


Although apparently no other direct measurements have been carried out on the
4722 Å bismuth line, a number of measurements have been made on the 3067 Å resonance
line and of the lifetime of the $6p^2 7s\ {}^4P_{1/2}$ level which is the proposed upper
laser level. These measurements, which are listed in Table I, can be used with the
branching ratio to obtain a better value for the transition lifetime of the 4722 Å bismuth
line.

TABLE I. Lifetime of $6p^2 7s$ bismuth resonance level.

Measurement                  Method          Lifetime (nsec)

Corliss and Bozman [7]       Emission        2.8
Lvov [10]                    Absorption      8.9 ± 1.8*
Rice and Ragone [11]         Absorption      5.3 (+2.3, -1.7)*
Cunningham and Link [12]     Phase Shift     5.9 ± 0.2
Svanberg [13]                Hanle Effect    4.75 ± 0.18
Anderson, et al. [14]        Beam Foil       4.7 ± 1.0

* As corrected in Ref. 12 by using the Corliss-Bozman
value of 0.97 for the branching ratio into the 3067 Å
resonance line.

Intermediate Coupling     Type of Exchange
SCF Calculation           Approximation          Form               Lifetime

Holmgren                  Hartree-Slater         Dipole Length      5.7 nsec
                                                 Dipole Velocity    2.6 nsec
Holmgren                  OHFS-Lindgren-Rosen    Dipole Length      4.1 nsec
                                                 Dipole Velocity    7.1 nsec
Kunisz and Migdalek       Lindgren                                  5.7 nsec

Table I reveals that all of the other experimental measurements of the $6p^2 7s\ {}^4P_{1/2}$
resonance level lifetime are longer than Corliss and Bozman's 2.8 nsec value. The
most accurate value is probably Svanberg's Hanle-effect measurement of 4.75 nsec.

Recently intermediate coupling calculations have been carried out for $6p^2 7s \rightarrow 6p^3$
transitions in bismuth. The calculated lifetime values are listed in the lower part
of Table I. The calculated transition probabilities depend on the local exchange
approximation as well as on the dipole length or dipole velocity form of the transition
probability operator, since the self-consistent field wavefunctions are not exact. The
difference between the dipole length and dipole velocity calculations can be viewed as an
indication of the inexactness of the wavefunctions. The calculations indicate that the
lifetime of the $6p^2 7s\ {}^4P_{1/2}$ bismuth resonance level is approximately 5 nsec, which is
consistent with the best experimental values and is substantially longer than the
Corliss-Bozman value.

The branching ratio, $A_{ul} / \sum_{\text{all } j} A_{uj} = A_{ul}\,\tau_u$, may be used to obtain a particular
transition lifetime from the level lifetime, $\tau_u$. The only experimental values available
to determine branching ratios in bismuth are from Corliss and Bozman. [7] These ratios,
however, should be more accurate than the value for an individual transition probability
since effects other than radiation trapping, such as uncertainty in the degree of
ionization or non-uniformity of the arc, are expected to be the dominant sources of error and
should substantially cancel out in the ratio.

Branching ratios into the 4722 Å bismuth transition are also available from the
intermediate coupling calculations. These values are in substantial agreement
with the Corliss-Bozman value as indicated in Table II.

TABLE II. Branching ratio into the 4722 Å bismuth transition.

Source                   Method                                  Branching Ratio

Corliss-Bozman [7]       Emission Measurement                    0.025
Holmgren                 Calculation - OHFS - dipole length      0.020
                                               dipole velocity   0.047
Kunisz-Migdalek          Calculation                             0.026

Using the Corliss-Bozman branching ratio of 0.025 for the 4722 Å $^4P_{1/2} \rightarrow {}^2D_{3/2}$
transition, we may deduce a 190 nsec transition lifetime from Svanberg's 4.75 nsec level
lifetime. This is substantially longer than the 111 nsec Corliss-Bozman value and also
longer than the improved value of 140 nsec.


All indications are, therefore, that the transition lifetime of the 4722 Å proposed
laser transition in bismuth is significantly greater than 100 nsec, the best value being
190 nsec. This corresponds to a spontaneous-emission transition probability for the
4722 Å bismuth line of $5.3 \times 10^{6}$ sec$^{-1}$. Excitation pulse risetimes in our various metal
vapor discharge tubes have been 50 to 100 nsec. During the course of our experimental
investigation of pulsed discharges in bismuth vapor we have produced excitation
current risetimes as short as ~25 nsec in a radial discharge between a wire at the
center and a cylindrical electrode at the wall of the discharge tube. Since excitation
pulse risetimes as short as ~25 nsec have been produced, and since laser action has
been produced on transitions with similar or shorter transition lifetimes in Mn and Pb,
it appears to be very unlikely that an insufficiently fast risetime of the current
excitation pulse can be the explanation for the absence of laser action at 4722 Å in bismuth
vapor.
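
The arithmetic behind the 190 nsec figure and the risetime comparison can be written out as a short sketch. It takes the level lifetime (Svanberg's 4.75 nsec) and the Corliss-Bozman branching ratio (0.025) from the text; the function names are illustrative, and the risetime test is simply the stated criterion that the risetime be shorter than the reciprocal of the transition probability.

```python
def transition_lifetime_ns(level_lifetime_ns, branching_ratio):
    """Transition lifetime = level lifetime / branching ratio (both from the text)."""
    return level_lifetime_ns / branching_ratio


def transition_probability_per_sec(transition_lifetime_nsec):
    """Spontaneous-emission transition probability A = 1 / lifetime, in sec^-1."""
    return 1.0 / (transition_lifetime_nsec * 1.0e-9)


def risetime_is_fast_enough(risetime_ns, transition_lifetime_nsec):
    """Criterion from the text: risetime shorter than 1/A for the laser line."""
    return risetime_ns < transition_lifetime_nsec


if __name__ == "__main__":
    lifetime = transition_lifetime_ns(4.75, 0.025)
    probability = transition_probability_per_sec(lifetime)
    print(f"transition lifetime: {lifetime:.0f} nsec")
    print(f"transition probability: {probability:.2e} sec^-1")
    for risetime in (25.0, 50.0, 100.0):
        print(f"risetime {risetime:.0f} nsec ok: "
              f"{risetime_is_fast_enough(risetime, lifetime)}")
```

All three risetimes mentioned in the text (25, 50 and 100 nsec) are shorter than the deduced transition lifetime, which is the basis for the conclusion drawn above.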
APPLICATION OF THE HEISENBERG PICTURE TO THE AVERAGED DESCRIPTION
OF THE PROPAGATION OF OPTICAL BEAMS

M.C.  Newstein  and  D.  Ramakrishnan 


A large class of problems describing the propagation of optical signals is described
by the quasi-optic equation

$$ i\lambdabar\,\frac{\partial S}{\partial z} = \left[-\frac{\lambdabar^{2}}{2}\,\nabla_T^{2} + V\right] S \tag{1} $$

where $S$ is the slowly varying complex envelope of an optical signal, $\nabla_T^{2}$ is the transverse
Laplacian and $V$ is proportional to the susceptibility of the medium, $\lambdabar = \lambda/2\pi$. Formal
integrals such as

$$ \langle z|F|z\rangle = \iint dx\,dy\; S^{*}(x,y;z)\,F(x,y,p_x,p_y;z)\,S(x,y;z) \tag{2} $$

where

$$ p_x = -i\lambdabar\,\partial_x, \qquad p_y = -i\lambdabar\,\partial_y, $$

may be used as concise representations of important field characteristics.

The optics problem of finding the dependence on z of the weighted operator function
$\langle F\rangle$ defined by Eq. (2) is analogous to the quantum mechanics problem of determining
the evolution, in time, of the expectation value of a dynamical operator F. The
states of a quantum system correspond to vectors, $|\psi\rangle$, in a Hilbert space. Dynamical
variables, such as position, $x$, and momentum, $p_x$, correspond to linear transformations
in this space. [1]


Alternative pictures are available for the interpretation of the dynamical evolution
of the wave function. [2] In the Schrödinger picture the state vectors are visualized as
evolving in time through a fixed coordinate frame determined by the eigenvectors of
time-independent operators. The quantum expectation value of the Schrödinger operator [3]
$F(x,p_x;t)$, in the Schrödinger state $|\psi(t)\rangle$, is given in the x representation by

$$ \langle\psi(t)|F(x,p_x;t)|\psi(t)\rangle = \int dx\; \psi^{*}(x,t)\,F(x,-i\hbar\,\partial_x;t)\,\psi(x,t) \tag{3} $$


In the Heisenberg picture the state vectors are visualized as fixed in time, but
the operators and their eigenvectors are time dependent. The quantum expectation
value of the Schrödinger operator $F(x,p_x;t)$ in the state represented by the vector
$|\psi(t)\rangle$ is equal to the quantum expectation value of the Heisenberg operator $F(x(t),p_x(t);t)$
in the initial state represented by the vector $|\psi(t_0)\rangle$, i.e.,

$$ \langle\psi(t)|F(x,p_x;t)|\psi(t)\rangle = \langle\psi(t_0)|F(x(t),p_x(t);t)|\psi(t_0)\rangle \tag{4} $$

In this report we have applied the Heisenberg picture to a set of problems including
those discussed in Reference 4.

A.  Free  Space  Propagation 

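The simplest example of the method is its application to the propagation of optical
beams in free space, where the potential function V is a constant. The Heisenberg
equations of motion for the transverse vectors become:

$$ \frac{d\boldsymbol{\rho}}{dz} = \frac{1}{i\lambdabar}\left[\boldsymbol{\rho},\, \frac{p^{2}}{2}\right] = \mathbf{p}, \qquad \frac{d\mathbf{p}}{dz} = \frac{1}{i\lambdabar}\left[\mathbf{p},\, V\right] = 0 \tag{5} $$

The corresponding mechanical equations of motion are the same as the classical
equations of motion for a particle in the absence of any applied force. The operator solution
is (taking $z_0 = 0$),

$$ \boldsymbol{\rho}(z) = \boldsymbol{\rho}(0) + z\,\mathbf{p}(0), \qquad \mathbf{p}(z) = \mathbf{p}(0) \tag{6} $$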

Thus, for the weighted average of the operator function $F(x,y,\partial_x,\partial_y;z)$ we have,
equating the Schrödinger picture expression to the Heisenberg picture result,

$$ \iint dx\,dy\; S^{*}(x,y;z)\,F(x,y,\partial_x,\partial_y;z)\,S(x,y;z) = \iint dx\,dy\; S^{*}(x,y;0)\,F(x - i\lambdabar z\,\partial_x,\; y - i\lambdabar z\,\partial_y,\; \partial_x,\, \partial_y;\, z)\,S(x,y;0) \tag{7} $$

For the specific case where we require the moment $\Gamma_{mn}(0,z)$, we have

$$ \iint dx\,dy\; x^{m} y^{n}\, |S(x,y;z)|^{2} = \iint dx\,dy\; S^{*}(x,y;0)\,\left[x - i\lambdabar z\,\partial_x\right]^{m} \left[y - i\lambdabar z\,\partial_y\right]^{n} S(x,y;0) \tag{8} $$

which is a polynomial in z of degree m + n. The lower moments have simple physical
interpretations, e.g., for m = n = 0 we have conservation of energy:

$$ \iint dx\,dy\; |S(x,y;z)|^{2} = \iint dx\,dy\; |S(x,y;0)|^{2} \tag{9} $$

The second centrifugal moment $\iint dx\,dy\,(x^{2} + y^{2})\,|S(x,y;z)|^{2}$ is a quadratic function of
the propagation distance z. The physical significance of the lower moments is discussed
by Vlasov, et al. [4] and Papoulis. [5]
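
The quadratic growth of the second moment can be checked numerically in one transverse dimension. The sketch below is illustrative only and rests on stated assumptions: the free-space equation is taken as $i\lambdabar\,\partial S/\partial z = -(\lambdabar^{2}/2)\,\partial^{2}S/\partial x^{2}$, the initial envelope is a real Gaussian $\exp(-x^{2}/2w^{2})$, and the values of the reduced wavelength, the width and the grid are arbitrary choices of ours. For a real initial envelope the term linear in z vanishes, so the Heisenberg-picture prediction is $\langle x^{2}\rangle(z) = w^{2}/2 + (\lambdabar z)^{2}/(2w^{2})$.

```python
import cmath
import math

LAMBDA_BAR = 0.1  # reduced wavelength, arbitrary length units (assumed value)
WIDTH = 1.0       # Gaussian width parameter w (assumed value)
N = 256           # number of grid points
DX = 0.125        # grid spacing


def dft(values, sign):
    """Plain O(N^2) discrete Fourier transform; sign=-1 forward, sign=+1 inverse."""
    n = len(values)
    out = []
    for k in range(n):
        acc = 0j
        for j, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * math.pi * j * k / n)
        out.append(acc / n if sign > 0 else acc)
    return out


XS = [(j - N // 2) * DX for j in range(N)]
KS = [2.0 * math.pi * (j if j < N // 2 else j - N) / (N * DX) for j in range(N)]
FIELD0 = [cmath.exp(-x * x / (2.0 * WIDTH ** 2)) for x in XS]
SPECTRUM0 = dft(FIELD0, -1)


def propagate(z):
    """Schrodinger-picture propagation: each Fourier component gains exp(-i*lam*k^2*z/2)."""
    shifted = [a * cmath.exp(-0.5j * LAMBDA_BAR * k * k * z)
               for a, k in zip(SPECTRUM0, KS)]
    return dft(shifted, +1)


def second_moment_numeric(z):
    """Normalized second moment of |S|^2 computed from the propagated field."""
    field = propagate(z)
    norm = sum(abs(s) ** 2 for s in field)
    return sum(x * x * abs(s) ** 2 for x, s in zip(XS, field)) / norm


def second_moment_heisenberg(z):
    """Prediction from x(z) = x(0) + z p(0) for a real Gaussian initial envelope."""
    return WIDTH ** 2 / 2.0 + (LAMBDA_BAR * z) ** 2 / (2.0 * WIDTH ** 2)


if __name__ == "__main__":
    for z in (0.0, 5.0, 10.0):
        print(z, second_moment_numeric(z), second_moment_heisenberg(z))
```

The point of the comparison is that the Heisenberg-picture result needs only moments of the initial field, whereas the Schrödinger-picture route requires propagating the field to each z before averaging.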
B. Perturbation Theory -- Application to Propagation Through Random Media

The Heisenberg picture is particularly suited to the concise representation of the
results of perturbation theory. Consider a wave equation of the form:

$$ i\lambdabar\,\frac{\partial S}{\partial z} = (H_0 + H_1)\,S \tag{10} $$

where $H_0$ and $H_1$ represent self-adjoint operators. We wish to express averaged
properties of the field developing under the full Hamiltonian in terms of expressions
involving field averages under the unperturbed Hamiltonian, $H_0$. The results of perturbation
theory are summarized by the following formal expression.

$$ \langle F\rangle_{H} = \left\langle \left(\exp\frac{i}{\lambdabar}\int_{0}^{z} H_1(z')\,dz'\right)_{-} F(z) \left(\exp\frac{-i}{\lambdabar}\int_{0}^{z} H_1(z')\,dz'\right)_{+} \right\rangle_{H_0} \tag{11} $$

The subscript H on the field average on the left side of Eq. (11) indicates that the
Heisenberg operator function $F(x(z), y(z), p_x(z), p_y(z); z)$ evolves in z under the full
Hamiltonian $H_0 + H_1$, i.e., F satisfies the equation

$$\frac{dF}{dz} = \frac{1}{i\bar{\lambda}}\,[F,\; H_0 + H_1] + \frac{\partial F}{\partial z} \quad \text{(evolution under } H\text{)} \qquad (12)$$

The subscript $H_0$ on the field average on the right side of Eq. (11) indicates that the
operators $H_1$ and F evolve in z under just the unperturbed Hamiltonian $H_0$;

(evolution under $H_0$)   (13)

The subscripts + and - on the exponentials in Eq. (11) are the time ordering symbols
which indicate that in the power series expansion the non-commuting $H_1$ factors are to
be ordered in time positively (later to the left) and negatively (later to the right),
respectively.


We now apply the general perturbation theory result, Eq. (11), to the problem of
light propagation in a medium with random refractive index inhomogeneities. We assume
the potential $V(\boldsymbol{\rho}, z)$ is a real random function. The moments of the mutual
coherence function may be conveniently expressed in terms of the generating function
$G(\mathbf{K}; \boldsymbol{\rho}_0; z)$, where


G(K;£q;z)  = < zl 


L£-  ^ 


K If  -£o  ■ £ I 


z> 


(14)

The formal perturbation expansion of this expression, treating the term $p^2/2$ as the
unperturbed Hamiltonian, is, in the Heisenberg picture:


z 

^ / dz'V(£(z');z') 

).  (exp 

J dz' v(£(z^)  + £q+ (z'-z>K  K;z' 

^i£(z).K^^£o-P,o^ 


(15) 


Equation (15) is a formally exact result, but approximation methods must be used in order
to obtain practical expressions for G. The direct expansion of the time ordered exponential
operators in powers of V is probably not as useful as a procedure which is capable
of preserving the exponential form. As a direct generalization of the cumulant expansion
of time-ordered exponentials we note the following expansion to second order in K:


ln(exp  K 


f fij(z')dz')_  (exp 


K 


f\(z')dz') 
0 ^ 


2 ^ 1 

= kJ  dZj(fij(Zj)  - Z 


2 


K 


L («i(^i)«i(^2>^ 


- 2(nj|zj)fl^'z2)T+  (n^[z^)Cl^{zl^^  - (fljlzj)  (jljlz^)  - (16) 


+ 


We  apply  this  approximation  to  Eq.  (15)  with  the  substitutions 


kS2j(z')  = i V(£(z');z') 

Kfl2(^  ) " ^ V(£(z')  + £q  + (z  ' z)^f  K ; z') 


(17) 


At this stage we have to be specific about the statistical properties of the medium.
We assume that $V(\boldsymbol{\rho}, z)$ is a random function with zero mean,

$$\overline{V(\boldsymbol{\rho}, z)} = 0 \qquad (18)$$

and with a two point correlation function of the form:

$$\overline{V(\boldsymbol{\rho}, z)\, V(\boldsymbol{\rho}', z')} = A(\boldsymbol{\rho} - \boldsymbol{\rho}')\, \delta(z - z') \qquad (19)$$

that is, a homogeneous process with zero correlation length in the propagation direction.
Under this assumption the generating function for the moments becomes


G(K,£o.z)  = exp 
and  we  have: 

,(£0.*)  = C^K=o= 


-Lj'“dZj[A(0)  -A(£q+\K(z^-z)] 


iK  • £ 5 

< 0 1 e e 1 o>  (20) 


oo 


A(0)  - A(£q)‘ 


(21) 


" I^^K=o" 


z}  0A 


r (£.  o)  - z X ^ ^ ^ r^^(£^.  o)  | 

A(0)  - A(£q)' 


exp( 


z) 


(22) 


etc . 

Vlasov, et al.⁴ obtained expressions for the moments of the mutual coherence
function from a recurrence equation which was derived from an assumed wave equation
for the mutual coherence function. The approximations which go into deriving the latter
equation are paralleled in our work, but here they enter expressions that give the
evolution of the moments directly.

TWO-PHOTON CORRELATIONS IN A WIDE ANGLE CHAOTIC LIGHT BEAM
S.M.  Turner  and  D.B.  Scarl 

The difference in time of arrival between single photon events at two photomultipliers
was measured in a quasimonochromatic, spatially coherent, widely divergent light
beam.¹ A chaotic (Gaussian) light source was provided by a pure ²⁰Ne discharge tube
filled to a pressure of 1.6 torr and operated at a current of 15 mA. This experiment
observed the Hanbury Brown-Twiss effect in light from the 585.2 nm line of neon 20,
a line in which the effect had never before been seen, and measured the two-photon
correlation function at much larger coherence angles than had previously been used.²⁻¹⁰

The coherence volume of an optical beam has usually been defined as the volume
of a right cylinder whose base is the spatial coherence area and whose height is the
temporal coherence length. An angular coherence volume can be defined as the product
of the solid angle subtended by the spatial coherence area (as viewed from the source)
and the temporal coherence length. The angular coherence volume has the advantage
of being a property of the source alone, since it is independent of the distance from the
source to the downstream plane at which the spatial coherence is evaluated. It is also
closely related to the volume of a single cell in phase space for the light leaving the
source. For a circular aperture of diameter a, the coherence angle can be defined as
λ/a.

Previous experiments have been carried out at small coherence angles and have
therefore observed fields over which the range of k values was small, i.e., |Δk| ≪ |k|.
The largest coherence angle to date was 4 milliradians, as compared with a coherence
angle of 292 mr in the present experiment. This increased coherence angle allows the
sampling of a much larger portion of the field leaving the source aperture by greatly
increasing the range of k values observed. With the increased coherence angle, the
purpose of the experiment became threefold:

(1) As a check on the correctness of the chaotic density matrix to second order

(2) If we consider our coherence volume as corresponding to one cell of photon
phase space, this experiment may provide additional information on the
spatial and temporal extent of a photon

(3) As a first step in an experiment which will enable us to look at a small number
of excited atoms.

Sidelight leaving the discharge tube was limited by a first circular aperture that
consisted of a 2 micrometer diameter hole in a 1 μm thick nickel foil. A second aperture,
with a diameter of 12.7 mm, placed 50 mm downstream, provided the effective detector
size and allowed the photomultipliers to detect substantially spatially coherent light.

A 585.3 nanometer interference filter with peak transmission of 50% and a full width at
half maximum of 3.0 nm assured that only the 585.2 nm line was observed. This Doppler
broadened line was chosen for its high spectral radiance; a scanning Fabry-Perot
interferometer gave a spectral width of 2108 ± 50 MHz (FWHM).
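
The quoted coherence angle can be checked directly from the definition λ/a given earlier, using the 585.2 nm line and the 2 micrometer aperture. The short Python sketch below does this, and also estimates a coherence time and length from the measured spectral width. The estimate 1/Δν is only an order-of-magnitude convention assumed here (the exact prefactor depends on the line shape and is not given in the report), and the function names are hypothetical.

```python
C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s


def coherence_angle(wavelength_m, aperture_diameter_m):
    """Coherence angle lambda/a in radians, as defined in the text."""
    return wavelength_m / aperture_diameter_m


def coherence_time(fwhm_hz):
    """Rough coherence time 1/(spectral width); an order-of-magnitude convention."""
    return 1.0 / fwhm_hz


theta = coherence_angle(585.2e-9, 2.0e-6)   # rad
tau = coherence_time(2108e6)                # s
length = C_LIGHT * tau                      # m
print(f"coherence angle  ~ {theta * 1e3:.1f} mrad")
print(f"coherence time   ~ {tau * 1e9:.2f} ns")
print(f"coherence length ~ {length * 100:.1f} cm")
```

The angle comes out at about 293 mrad, in agreement with the 292 mr quoted for this experiment. Under the stated assumption the coherence time is a fraction of a nanosecond, shorter than the 1.5 ns detector resolution described below.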

Using a delayed coincidence technique with a dual discrimination system and two
photomultipliers having a Gaussian time resolution function with a full width at half
maximum of 1.5 ns, a time to height converter recorded a coincidence count whenever
photon events occurred at the two detectors within 20 ns of each other.

The output pulses from the time to height converter were analyzed and stored in a pulse
height analyzer calibrated to 100 ps per channel according to the difference in the time
of arrival of the single photon events.
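
The sorting of coincidences into delay channels can be sketched in software as follows. This is an illustrative stand-in for the time to height converter and pulse height analyzer, not a description of the actual electronics: it assumes event times are available as integer picoseconds, that delays of either sign within 20 ns are accepted, and it uses a hypothetical function name and made-up event times.

```python
def delay_histogram(times_a_ps, times_b_ps, window_ps=20_000, channel_ps=100):
    """Histogram of arrival-time differences (detector B minus detector A).

    Times are integers in picoseconds. Pairs separated by more than the
    coincidence window are ignored; accepted pairs are binned into channels
    of width channel_ps, with channel 0 at a delay of -window_ps.
    """
    n_channels = 2 * window_ps // channel_ps
    counts = [0] * n_channels
    for ta in times_a_ps:
        for tb in times_b_ps:
            dt = tb - ta
            if -window_ps <= dt < window_ps:
                counts[(dt + window_ps) // channel_ps] += 1
    return counts


# Made-up event times (ps), only to exercise the binning.
counts = delay_histogram([0, 100_000], [150, 99_950, 130_000])
print(len(counts), sum(counts))
```

The double loop is quadratic in the number of events and is kept only for clarity; for real data rates one would sort the two lists and sweep a window over them.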

The experimental results are shown in Fig. 1, together with a curve of the two-photon
correlations to be expected on the basis of a constant energy maximum entropy
density matrix for the chaotic field. The measured points and calculated curve are in
reasonable agreement, showing that the usual chaotic density matrix provides a good
representation of the chaotic electromagnetic field for coherence angles at least as
large as 292 mr.

Fig. 1. The two-photon counting rate R(Δt) vs. Δt for
light from a neon discharge passing through a
2 micrometer diameter aperture. The solid
line is the prediction of a maximum entropy
density matrix. The error bar shown is common
to all of the experimental points.

The Hanbury Brown-Twiss effect was observed in the 585.2 nm spectral line of
neon for a coherence angle of 292 mr, representing a solid angle approximately 4000
times greater than that used in any previous second order correlation function
measurement. The large coherence angle (1) served as a check on the choice of the maximum
entropy constant energy density matrix to second order by allowing the observation of
a much larger portion of the field, leading to an increase in the region of the coherence
volume, and (2) allowed the investigation of the effect on the counting statistics of
small source apertures, i.e., large coherence angles.

AVALANCHE INITIATED OSCILLATIONS IN UNIPOLAR STRUCTURES UNDER
PULSED EXCITATION

B.  Senitzky  and  S.  Gottfried 

Avalanche in semiconductor devices became an interesting phenomenon to many
people when J.B. Gunn¹ showed that the avalanche region in a semiconductor could,
under the proper conditions, cause a negative differential resistance in the device's
terminal characteristics. Useful sources of microwave energy up to millimeter
frequencies were developed, based on this phenomenon.²⁻⁴ These devices were called
IMPATT diodes (an acronym for Impact Avalanche Transit Time).

Several years later, a negative resistance device was discovered which involved
a different mechanism, but was avalanche initiated. Dubbed the TRAPATT (Trapped
Plasma Avalanche Triggered Transit), it was shown to oscillate at lower frequencies than the
IMPATT, but at much higher efficiency.⁵

At around the same time, some interest was generated in studying avalanche
phenomena in unipolar devices. Several authors⁶ had predicted the existence of a negative
differential resistance in near intrinsic silicon and germanium. In 1972, Dworsky
and Harrison⁷ proposed an analytical model for an n⁺nn⁺ structure under avalanche
conditions. The analysis provided an explanation for the differential negative resistance
but there was little experimental evidence to support the existence of the phenomenon.

This contribution is a report of the experimental, analytical and computer investigation
to observe and study the avalanche initiated phenomena occurring in an n⁺nn⁺
structure under pulsed conditions. The motivation for the work is twofold: first, the n⁺nn⁺
structure is simpler to fabricate than a junction device, and therefore has the
potential of being a more attractive, useful source of microwave energy; second, the
study of the n⁺nn⁺ device could improve our understanding of ohmic contacts under high
fields, a situation that occurs very often in solid-state device and integrated circuit
technology.

A.  Device  Operation 

This is a simplified explanation of the physics of the phenomena that occur in an
n⁺nn⁺ structure when excited with a high-speed, high-current pulse. The effects
following the excitation can be divided into three periods:

(1) A charging period where the voltage across the device increases rapidly

(2) A plasma formation period in which the diode switches to a low voltage,
high-conductivity state

(3) An extraction period, in which the plasma is swept from the active region
and the diode voltage and resistivity increase.

B. Charging Period

Consider an n⁺nn⁺ structure, in which a current flows which requires a larger
electron concentration than exists in the n region. The electric field will then assume
a triangular profile, as shown in Figure 1. Consider a high-speed pulse of current,
with amplitude large enough to cause avalanche. (The speed and amplitude of the current
pulse are defined in terms of the semiconductor properties and the device geometry.)
In order to support this high-current level, a pulse of electrons will be injected
from the cathode into the n region. Ahead of this pulse, the current is supported by a
uniform rise of the electric field (i.e., a displacement current).
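
The triangular field profile follows from Poisson's equation once the injected electron density exceeds the donor density. The Python sketch below illustrates this under simplifying assumptions that are not taken from the report: electrons drift at a constant saturated velocity throughout the n region, the field is zero at the cathode edge, and the silicon values used (saturated velocity 1e5 m/s, relative permittivity 11.7) are typical textbook figures. The current density, doping, and length in the example are hypothetical.

```python
Q = 1.602176634e-19       # elementary charge, C
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m


def injected_density(j, v_sat):
    """Electron density (m^-3) needed to carry current density j (A/m^2) at v_sat (m/s)."""
    return j / (Q * v_sat)


def field_profile(j, n_donor, length, v_sat=1.0e5, eps_r=11.7, points=11):
    """Field magnitude versus distance from the cathode, from dE/dx = q (n - N_D) / eps.

    Returns a list of (x, E) pairs in SI units. The profile is a straight
    line rising toward the anode whenever n exceeds the donor density N_D.
    """
    n = injected_density(j, v_sat)
    slope = Q * (n - n_donor) / (eps_r * EPS0)   # V/m^2
    xs = [i * length / (points - 1) for i in range(points)]
    return [(x, slope * x) for x in xs]


# Hypothetical numbers: 1e4 A/cm^2, N_D = 1e14 cm^-3, 10 micrometer n region.
profile = field_profile(j=1.0e8, n_donor=1.0e20, length=10e-6)
peak_field = profile[-1][1]
print(f"peak field at anode ~ {peak_field / 1e5:.0f} kV/cm")
```

With these illustrative numbers the field at the anode is well above typical avalanche fields in silicon, which is the condition the text relies on for avalanche to start at the anode.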


under  high  current  conditions. 

Since the field peaks at the anode (Fig. 1), the rapid increase will cause avalanching
to occur there first (assuming the rise is rapid enough so that the electron pulse
from the cathode has not reached the anode yet). As the field keeps increasing, the
point at which avalanche occurs will sweep across the n region (from anode to cathode)
at a velocity which is different from the saturated carrier velocity. (For high efficiency
this velocity should be greater.)

C.  Plasma  Formation  Period 

At the avalanche point, a very high density of electrons and holes forms. For the
above conditions, the holes are left behind the avalanche zone. Since the electrons will
drift towards the anode and holes will follow behind the avalanche zone towards the
cathode, the result is a net positive charge behind the moving zone. This causes the
slope of the field to reverse and decrease behind the moving zone. For the device
parameters considered here the field at this point will drop to a small value.

Under the influence of the small electric field behind the zone, a high density,
slowly moving plasma is formed and spreads out over most of the device. When the
moving avalanche zone meets the electron pulse from the cathode, the avalanche will
quench and there is no further generation of charges. The result is a low voltage and high
current in the device terminal characteristic. Since the voltage has decreased while the
current remained constant, a differential negative resistance is realized.

D.  Plasma  Extraction  Period 

Under the influence of the small electric fields, the electrons and holes will move
slowly towards the anode and cathode, respectively. In the region behind the moving
holes, the net space charge is negative and the electric field will assume an increasing
negative slope. As the slope increases, the velocity of the carriers will increase and
the terminal voltage will rise. Since the current pulse is held constant, the result is
an increase in device terminal resistance.

The device terminal characteristics calculated by a piecewise linear analysis based
on the above arguments are summarized in Figure 2. It can be shown that the first
harmonic components of the current and voltage are out of phase, and hence
the device will exhibit a differential negative resistance in its terminal characteristic.

Fig. 2. Theoretical device current and
voltage versus time.

A numerical solution on a computer of the Maxwell Equations in one dimension
for the boundary conditions of an n⁺nn⁺ structure resulted in the waveforms shown in
Figure 3. The computer generated waveforms are similar to the ones calculated in the
piecewise linear approximation summarized in Figure 2. The numerical analysis is
not limited by many of the assumptions of the piecewise linear theory, and the agreement
provides good support for the simplified analysis.


Fig.  3.  Computer  generated  device  current 
and  voltage  versus  time. 

E.  Experiment 

The n⁺nn⁺ wafers were fabricated in the laboratory and single devices cut from
the wafers were mounted in a structure shown in Figure 4. The devices were bonded
to the bottom of the package and ultrasonic bonding was used to contact the top of the
device.

The circuit used to pulse the devices is shown in Figure 5. Subnanosecond risetime
current pulses of amplitudes large enough to drive the devices into avalanche are
available.

The mechanism of the negative resistance was studied, as well as the effects of
device thickness, area, surface preparation, current amplitude, and doping density on
the rise time, plasma extraction time, negative resistance, efficiency, lifetime, and
stability of the devices fabricated.

F.  Results 

The pulsed I-V characteristics of the trapped plasma devices, observed with a
sampling oscilloscope, are shown in Figure 6. This is the first
time that these results have been obtained. The photograph shows the pulsed current
excitation and the resultant terminal voltage as a function of time.


Fig.  6.  Experimentally  observed  device  current 
and  voltage  versus  time. 

G.  Comparison  with  Theory 

As can be seen by comparing Figs. 6 and 2, the experimental waveforms are
similar to those predicted by the theory. The two major differences between the theory
and observed waveforms are the voltage rise and fall times after current pulsing and
the voltage amplitude at the minimum. Both of these effects are due to the 0.4 nanosecond
rise time of the sampling oscilloscope used. At the present time, the voltage
waveforms are being deconvolved with the impulse response of the oscilloscope, to
determine the original voltage signals. The results of this operation should show a
closer agreement between the theory and experiment.

The harmonics of the observed waveforms are presently being analyzed to determine
the magnitudes of the negative resistance observed and a predicted oscillation
efficiency (as determined by the relative magnitude of the first harmonic component).
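
A harmonic analysis of this kind can be sketched as follows. The fragment extracts the first harmonic of sampled voltage and current waveforms and forms their ratio; a negative real part of that ratio corresponds to a negative resistance at the fundamental. It assumes the samples cover exactly one period on a uniform grid, and the waveforms used here are synthetic sinusoids chosen only to exercise the code. They are not the measured or computed waveforms of the report, and the function names are hypothetical.

```python
import cmath
import math


def first_harmonic(samples):
    """Complex amplitude X1 such that the fundamental is Re[X1 * exp(i * theta)].

    samples must cover exactly one period on a uniform grid.
    """
    n = len(samples)
    return (2.0 / n) * sum(
        x * cmath.exp(-2j * math.pi * k / n) for k, x in enumerate(samples)
    )


def fundamental_impedance(voltage, current):
    """Ratio of the first harmonic of the voltage to that of the current."""
    return first_harmonic(voltage) / first_harmonic(current)


# Synthetic example: the voltage fundamental is nearly in antiphase with the current.
N = 256
theta = [2.0 * math.pi * k / N for k in range(N)]
current = [1.0 + 0.2 * math.cos(t) for t in theta]
voltage = [5.0 - 0.4 * math.cos(t - 0.3) for t in theta]
z1 = fundamental_impedance(voltage, current)
print(z1.real, z1.imag)
```

The dc levels drop out of the first harmonic, so only the relative phase and amplitude of the fundamental components determine the sign and size of the resistance.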

H.  Conclusion 

The experimental results show good agreement with the theoretically predicted
terminal voltages and currents of the n⁺nn⁺ device under pulsed excitation. The device
is shown to be a useful source of microwave energy.

Joint  Services  Technical  Advisory  Committee 

F44620-74-C-0056  B.  Senitzky  and  S.  Gottfried 

REFERENCES 

1. J.B. Gunn, "Avalanche Injection in Semiconductors," Proc. Physical Society
(London), B69, 781-790 (1956).

2. W.T. Read, Jr., "A Proposed High Frequency Negative Resistance Diode," Bell
System Technical Journal, 37, 401 (1957).

3. R.L. Johnston, B.C. DeLoach and B.G. Cohen, "A Silicon Diode Microwave
Oscillator," Bell System Technical Journal, 369 (1965).

4. C.A. Lee, R.L. Batdorf, W. Wiegmann and G. Kaminsky, "Technological
Developments Evolving from Research on Read Diodes," IEEE Trans. on Electron Devices,
ED-13, No. 1, 175 (January 1966).

5. H.J. Prager, K.K.N. Chang and S. Weisbrod, "High Power High Efficiency Silicon
Avalanche Diodes at Ultra High Frequencies," Proc. IEEE, 586-587 (April 1967).

6. J.N. Park, K. Rose and K.E. Mortenson, "Avalanche Breakdown in Near Intrinsic
Silicon and Germanium," J. Appl. Phys., 38, No. 13 (1967).

7. L.N. Dworsky and R.I. Harrison, "Trapped Plasma Oscillations in Unipolar
Semiconductor Structures," IEEE Trans. on Electron Devices, ED-19, No. 6, 836-838
(June 1972).



SOLID  STATE  AND  MATERIALS 




HALL EFFECT IN SILICON INVERSION LAYERS
Y.  L.  Yao  and  B.  Senitzky 

The effect of a transverse magnetic field on a silicon inversion layer has been
studied at liquid nitrogen temperature. The results, similar to those of the Hall effect,
showed that the channel conductance varied both linearly and quadratically with the
applied magnetic field. Using thick oxide MOS devices, a maximum change of channel
conductance of 10 percent has been observed at 15 kG. This observed value is much
larger than the theoretical values obtained from either the classical model² or quantum
mechanical analysis.³

A 

This phenomenon could be of interest in the study of charge coupled devices (CCD).
It has been established that one of the factors which prevent ideal charge transfer in a
CCD is the loss of charge to surface states at the Si-SiO₂ interface. The variation of
channel conductance with magnetic field suggests that the moving carriers in the
inversion layer are being pushed toward or away from the surface by the Lorentz force,
thus causing a decrease or increase of conductance. However, devices used in previous
experiments were not suitable for CCD's because of the thick gate oxide. Consequently,
we experimented with MOS devices with gate oxides whose thickness is comparable to that
of CCD's.

In this contribution we will describe our measurements of channel conductance
variation of a MOS device at liquid nitrogen temperature and magnetic fields up to a
maximum of 15 kG. We will then compare these measurements to a surface scattering
model and show that this model is in good agreement with our experimental data.

A.  Previous  Physical  Interpretations 

A first order classical analysis of this Hall effect was first given by Tansal.² He
stated that the number of carriers in an inversion layer was modified by the magnetic
field and thus resulted in a change of channel conductance. However, there are several
problems with this approach. First, it is not clear how the number of carriers in an
inversion layer is affected by the magnetic field. Secondly, this approach results only in
a linear relation between the channel conductance and the magnetic field.

A quantum mechanical approach to determine the Hall voltage of a thin current
layer under a magnetic field was given by Stern.³,⁵ He limited his analysis to the case
where all carriers are in the lowest level (n = 0). The average distance of the carriers
from the surface is calculated both for zero and finite magnetic field. Using second
order perturbation theory, the change of this distance with magnetic field is calculated.
The modified surface potential can then be computed from this change. This modified
potential is referred to by Stern as the quantum mechanical Hall voltage.

A first order estimate of the channel conductance variation can be made by adding
the quantum mechanical Hall voltage directly to the effective gate voltage and computing
the conductance from the relation between the conductance and the gate voltage.
However, the computed conductance change is much smaller than the observed value.

B.  Surface  Scattering  Model 

We propose a different model which accounts for the changed carrier mobility.

It is well known that carriers in an inversion layer are diffusely scattered from the
oxide surface and thus show lower mobility than in bulk material. We used a simple
classical model similar to that suggested by Fang and Triebwasser and assumed that:

(1) the density of the free carriers is uniform in the channel, and

(2) the gate oxide is free of charge.

To simplify the calculation, we selected a carrier initially at rest at the interface
of the inversion and the depletion layers and moving toward the surface. We then
calculate the time for the carrier to reach the surface with and without a magnetic field.

Assuming that this carrier reaches the surface with no intermediate scattering
and that its effective mass remains constant, we can compute the mobility from this transit
time. The resultant channel conductance variation due to magnetic field shows both the
linear and quadratic relationship to the magnetic field. More important, the general
shape of the resultant curve agrees well with observations, as will be shown.

C.  Experiments  and  Results 

Figure  1 shows  the  device  structure  used  in  this  study.  The  structure  is  a linear 
n-channel  MOS  transistor  with  source-drain  spacing  of  2 mil  and  a channel  width  of 
40  mil.  The  channel  is  measured  at  the  source  contact.  The  rectification  properties 
of  the  source  and  drain  diffusions  allow  only  one  type  of  carriers  (the  majority  carriers 
or  electrons  in  this  case)  to  flow.  Figure  2 shows  the  electrical  circuit  used  in  the 
experiment. 

The devices were mounted on a non-magnetic substrate with brass pins for electric
connections. A special liquid-nitrogen dewar was designed to allow the device to
be placed in a Varian electromagnet with a 20 mm airgap. Figure 3 shows the physical
experiment set-up.

A typical curve showing the variation of the dc source-drain current as a function
of magnetic field at fixed source-drain and gate voltages is shown in Figure 4. The
vertical bar at each data point represents the variation of each measurement due to
noise. The smooth curve is a least square fit of the experimental data to a second order
polynomial.
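
A least square fit to a second order polynomial can be sketched as below, by forming and solving the 3-by-3 normal equations. The data in the example are made-up numbers generated from a known quadratic, purely to exercise the routine; they are not the measured source-drain currents, and the function name is hypothetical.

```python
def fit_quadratic(xs, ys):
    """Least-squares coefficients (a0, a1, a2) of y ~ a0 + a1*x + a2*x**2."""
    # Power sums that make up the normal equations.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [ar - f * ac for ar, ac in zip(a[r], a[col])]
    return tuple(a[i][3] / a[i][i] for i in range(3))


# Made-up data: field in kG, normalized current from a known quadratic.
fields = [0.0, 3.0, 6.0, 9.0, 12.0, 15.0]
currents = [1.0 + 0.002 * b - 0.0001 * b * b for b in fields]
a0, a1, a2 = fit_quadratic(fields, currents)
print(a0, a1, a2)
```

The fitted a1 and a2 are the linear and quadratic magnetic-field coefficients of the current, which is how a fit of this kind separates the two dependences.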
Fig. 5. Comparison of different theoretical results
and the experimental values.

The shape of the conductance vs. magnetic field curves is reproducible and is similar
to that previously measured, but the magnitude is an order of magnitude smaller. We
believe that our results are more reliable because our device parameters, such as oxide
quality, are more carefully controlled. These new experimental results are in closer
agreement with the theoretical values predicted by the models described above. Figure 5
shows a comparison of typical experimental results with the different theoretical values
calculated for the same experimental conditions.

D.  Summary 

An experimental study of the magnetoconductance of an inversion layer was made
with N-channel MOS devices on 100 ohm-cm substrate material. A 1.2 percent change
in channel conductance was observed at a dc magnetic field of 15 kG applied tangentially
to the surface and perpendicularly to the current flow. This is an order of magnitude
smaller than previous experimental results using similar devices with thick oxide
gates.

A surface scattering model based on the theory of variable carrier mobility was
presented. This model yields both the linear and quadratic dependence of channel
conductance on magnetic field. The agreement with the experimental results is much better
than that achieved with previous models. Nevertheless, we feel that further experiments
are required in this area to determine the exact nature of this phenomenon.

LARGE SIGNAL TRANSIENT RESPONSE OF MOSFETS
R.  Kinasewitz  and  B.  Senitzky 

Although the metal-oxide-semiconductor field-effect transistor (MOSFET) has
been in use for a long time, most studies to date have been directed towards its steady
state and small signal operation. The reason for this is the difficulty of analyzing the
nonlinear physical phenomena involved in the large signal transient conditions.¹

All investigations formerly published on the transient behavior of the MOSFET
have been subject to various restrictions. For example, some investigators assume
the drain-to-source voltage to be zero,² or consider only very long channels where the
parasitic capacitances are of little influence.²,³ Others use oversimplified physical
assumptions.⁴

Our investigation is directed towards identifying and quantifying some of the physical
phenomena which are particularly significant with respect to the turn-on behavior
of MOSFETs.

A.  Experiment 

Our experiment consists of exciting a MOSFET with a step input in the circuit
configuration shown in Fig. 1 and measuring the output. The excitation is obtained from
the negative step calibration output of a Tektronix Type 519 Oscilloscope. This step
waveform has a rise time of about 100-150 picoseconds. The MOSFET circuit of Fig. 1
is installed in a General Radio Type 874-X Insertion Unit. The output waveform is
observed on a Tektronix Type 564 Storage Oscilloscope containing the Type 3T77 Sampling
Sweep and Type 3S76 Vertical Sampling plug-in units. The Type 3S76 plug-in unit has
a specified rise time of less than 400 picoseconds.


Fig. 1. (Circuit diagram; the device is a General Instruments MEM 806.)

B. Results

A typical set of output waveforms is shown in Figure 2.

Fig. 2. (Output waveforms; vertical axis: voltage output.)

The negative-going undershoot portion of the response has been observed previously and is
believed to arise from two mechanisms.


One mechanism is the parasitic capacitances: the gate-to-source capacitance
(C_GS), gate-to-drain capacitance (C_GD), and the drain-to-source capacitance (C_DS)
(see Figure 3). The parasitic capacitance effect on the MOSFET transient can be
estimated by considering the circuit of Fig. 4 and noting that the output response of this
circuit to a negative step input is that shown in Fig. 5, which is similar to the observed
undershoot.


Fig.  3. 




The  other  proposed  reason  for  the  undershoot  is  that  at  the  beginning  of  the  trans- 
ient the  holes  in  the  inversion  layer  are  partially  supplied  from  the  drain  and  therefore 
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Fig,  5. 


the  initial  drain  current  is  reversed.  This  drain  current  reversal  is  not  widely  known 
or  well  understood  and  we  would  like  to  investigate  it. 

The positive-going overshoot of Fig. 2 has not been observed previously and may be due to the following: since the mobile space charge density in the channel inversion layer is a function of the drain-to-source voltage, the channel has to acquire and dispose of part of its mobile space charge before reaching steady state.

Work on the above phenomena will continue; to this end, provisions are being made to use a faster detection system.

Joint Services Technical Advisory Committee

F44620-74-C-0056  R. Kinasewitz and B. Senitzky


REFERENCES

1. T.W. Collins, Ph.D. Dissertation, University of California, Davis (June 1973).

2. K. Goser, Arch. El. Ubertr., 21 (1970).

3. D. Landgraf-Dietz, Nachrichtentechnik, 296 (1968).

4. M.E. Zahn, Solid-State Electronics, 17, 843 (1974).

5. F. Grimmer and K. Goser, Arch. El. Ubertr., 197 (1972).

6. P. Richman, "MOS Field-Effect Transistors and Integrated Circuits" (Wiley-Interscience, 1973).




SOLID  STATE  AND  MATERIALS 




RELIABLE OHMIC CONTACTS TO SILICON DEVICES
M.  Eschwei  and  S.  Gottfried 

An important step in the fabrication of silicon devices is the formation of an ohmic contact to the device [1-3]. Silicon oxides and other contaminants on the surface [4,5], as well as processing-induced impurities, will preclude reliable contacts with good electrical characteristics.

Many authors have studied the processes, materials, and results of making ohmic contacts to semiconductor structures [2,5]. Reliable adhesion of the metal to the silicon and ease of bonding to the contact are of major concern. Gold is a desirable contact material because of its resistance to corrosion, ease of bonding, and high elongation (which accommodates thermal expansion mismatch with the silicon). The adherence of gold, and of all precious metals, to silicon is, however, very poor. Chromium has been used as a precoating to improve the adhesion [5]. Sputtering several layers of material [6,7] provides a good contact to the silicon, but is a complicated technique. Glow discharge cleaning just prior to film deposition improves adhesion, but this may add interface contaminants [8]. Other processes, such as electroless nickel plating [9], are less reliable.

Having tried various wafer cleaning and etching processes, and contact formation by thermal evaporation and electroplating of different metals, we have developed a simple, reliable technique that does not require expensive doped gold targets. It provides ohmic contacts to silicon devices with excellent adhesion and repeatable electrical properties.

We decided to try the sputter-etch technique of removing surface impurities to obtain good ohmic contacts with strong adhesion [10], followed immediately by hot filament evaporation of the contact material. Using this process we avoid buying an expensive, limited-use doped gold target. A Materials Research Corporation 8620 RF Sputtering Module, collar-mounted on a 6" oil diffusion pumped system, was modified. A high-current feedthrough has been installed in the sputtering module collar to supply power to a resistively heated gold-antimony source. This filament source is mounted in a sheet stainless steel housing located next to the etching platform in the sputtering unit. Proper shielding keeps the rest of the system from being coated. The stranded tungsten filament is roughly "W" shaped to fit the small area available. The filament is wound with 18 cm of 0.25 mm diameter 99.95 percent gold wire over 76 cm of 0.10 mm diameter gold wire doped with 0.6 percent antimony. After detergent and solvent ultrasonic cleaning, it is mounted in the filament housing and the system is evacuated for premelting in vacuum. This removes any surface impurities that might be left after ultrasonic cleaning, and it also makes the gold-antimony wet and flow along the filament for better coverage of the wafer.

A "pusher" has been fastened to the movable shutter supplied with the module. Using this shutter, the wafer, as shown in Fig. 1, can then be pushed from the sputter-etch platform over and under the filament for downward film deposition immediately after sputter-etching. The wafer to be made into a device is itself placed on a larger silicon wafer, which in turn rests on a "pedestal" of 3 mm thick glass. The larger wafer protects the device wafer from any back-sputtered impurities from the "pedestal" or the aluminum target plate. The thick glass raises the wafers and provides enough thickness for the "pusher" to contact and move the wafer into position under the filament.

Fig. 1. Diagram of parts added and setup. Legend: 1 test strip; 2 lead; 3 thickness plate; 4 larger wafer; 5 wafer; 6 housing; 7 filament; 8 platform; 9 pusher; 10 movable shutter; 11 pedestal; 12 sputter-etch [label truncated].

Mounted on a demountable platform under the filament housing is a glass test strip with fired-on platinum-gold contacts. A mica mask is used to give a film area of about 40 squares. This allows the film thickness to be monitored during the evaporation, since it results in a film with a resistance high enough to be read on an ohmmeter for a thickness of 6000 to 7000 angstroms. A partially masked piece of microscope slide glass is also mounted on the platform so that the film thickness step can be checked later by both Tolansky interferometry [11] and Talysurf profilometry [12] measurements.

The glass test strip and the thickness plate are cleaned and mounted on the demountable platform, which is then put in the system. Different cleaning methods were used for the wafer, after which it was immediately placed on the "pedestal" and the system evacuated. Standard procedures were used for pumping down and flushing the system with ultra-pure argon. Approximately 5500 angstroms of the n-type silicon wafer was etched from the top surface at 150 watts forward power applied for 55 minutes. An argon atmosphere is maintained at 0.86 pascals (6.5 microns) by throttling back the high vacuum valve. Opening the high vacuum valve immediately on completion of the etching brings the pressure down to less than 8.8 x 10^-4 pascals (6.6 x 10^-6 Torr). The wafer is then pushed under the filament for the immediate deposition of the antimony-gold. A filament current of approximately 80 amperes brings the test strip resistance to about 4 ohms in 5 seconds, corresponding to a film thickness of approximately 7000 angstroms.
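The test-strip reading can be turned into a sheet resistance and an effective film resistivity. The sketch below uses only numbers quoted in the text (about 40 squares, about 4 ohms, about 7000 angstroms, and 5500 angstroms etched in 55 minutes); it assumes a uniform film and negligible contact resistance, and the function and variable names are illustrative.

```python
ANGSTROM_M = 1e-10  # metres per angstrom

def sheet_resistance(resistance_ohm, n_squares):
    """Sheet resistance in ohms per square for a uniform strip."""
    return resistance_ohm / n_squares

def film_resistivity_uohm_cm(sheet_ohm_per_sq, thickness_angstrom):
    """Effective resistivity in microhm-centimetres (rho = R_sheet * thickness)."""
    rho_ohm_m = sheet_ohm_per_sq * thickness_angstrom * ANGSTROM_M
    return rho_ohm_m * 100.0 * 1e6  # ohm-m -> ohm-cm -> microhm-cm

r_sheet = sheet_resistance(4.0, 40)
rho = film_resistivity_uohm_cm(r_sheet, 7000.0)
etch_rate = 5500.0 / 55.0  # angstroms per minute during sputter-etching
print(f"{r_sheet:.2f} ohm/sq, {rho:.1f} microhm-cm, {etch_rate:.0f} A/min")
```

The same arithmetic shows why a strip of about 40 squares is convenient: at this thickness a single square would read only about 0.1 ohm, which is awkward to follow on an ohmmeter.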

The two sides of the wafer must be processed in separate runs. Sputter-etching and deposition on the second side cause no apparent degradation of the first side.

The sputter-etch plus vacuum deposition technique was used to form contacts on n-type epitaxial silicon wafers. The 50 ohm-centimeter resistivity epitaxial layer is formed on an n+ substrate with a resistivity of 0.001 ohm-centimeter. Two sets of wafers were prepared. One set was etched approximately 6.5 minutes in Buffer HF solution [13], and the second set was etched 15 seconds in a solution of nitric, hydrofluoric, and acetic acids. The Buffer HF removes only surface oxides, while the acid mixture actively etches the wafer. The wafers were thoroughly washed in deionized water and blown dry with dry nitrogen. As described above, these wafers were then sputter-etched, followed by the vacuum deposition of gold from the antimony-doped gold wire. As a control, contacts were formed on two other sets of wafers using the same chemical-etch cleaning procedures, except that these were not sputter-etched. They were simply loaded into a conventional vacuum system and immediately evacuated; the antimony-doped gold film was then deposited from a molybdenum strip boat by thermal evaporation at 8.8 x 10^-4 pascals.

The I-V characteristics of the four sets of slices were compared using a diode curve tracer. Typical I-V characteristics of the wafers that were sputter-etched are shown in Figure 2a; those of the control groups are shown in Figure 2b. As can be seen, the wafers that were sputter-etched prior to the vacuum deposition have good ohmic contacts, while the others have very "leaky" blocking-type contacts. The results were very consistent.

The adhesive tape tests [14] and scratch tests [15] for film adhesion were performed on all the wafers. The sputter-etched groups were the only ones that maintained good adhesion throughout all the tests. The gold contact on the other wafers peeled off with adhesive tape and with diamond scoring, making it impossible to maintain contacts to make devices. Leads were easily attached to the strongly adherent gold contacts by thermocompression bonding.

Using this fabrication technique, unipolar silicon devices were made for further investigation. See the article "Avalanche Initiated Oscillation in Unipolar Structures under Pulse Excitation" by B. Senitzky and S. Gottfried in this report.


Joint  Services  Technical  Advisory  Committee 

F44620-74-C-0056  M. Eschwei and S. Gottfried


NEW SOLID STATE MATERIALS

E.  Banks,  M.  Greenblatt,  B.R.  McGarvey,  S.  Nakajima,  M.  Shone  and  G.  Torre 

Research in the area of new materials is devoted to the synthesis, crystal growth and characterization of materials with interesting electrical, optical and magnetic properties that may provide a basis for new optical, electro-optic and magneto-optic devices. Section A describes the continuation of studies of complex transition metal fluorides. Section B briefly discusses studies of the excitation of infrared-to-visible conversion in CdF2(Er3+, Yb3+). Section C describes NMR studies of Er-F interactions, which will be useful in determining the mechanism of energy transfer in these crystals.

A.  Complex  Transition  Metal  Fluorides 


The research in this section involves the investigation of ternary fluorides of alkali metals and transition metals, of the general formula A_x M(II)_x M(III)_(1-x) F_3 (A = K, Rb, Cs; M(II) = Fe++, Mn++, Cr++, etc.; M(III) = Fe+++, Cr+++, V+++, etc.). The compounds have the "tetragonal tungsten bronze" structure for x between 0.4 and 0.6, and we have focused on this region because of the well-known ferroelectricity and non-linear optical behavior of oxides having this structure, the possibility of studying ferromagnetic and antiferromagnetic exchange interactions in a complex structure, and the possibility that conductive phases may develop in such systems when mixed valence states of the same element are present. The last possibility has been ruled out for the cases of Fe, Cr and V, leaving Ti as the only remaining possibility. Previous work has shown antiferromagnetic behavior in most of the phases, except for one case which appears to be ferrimagnetic below 80 K. Mossbauer studies of K0.5FeF3 showed evidence of preferential substitution of Fe2+ in one of the sites in the unit cell.

A Mossbauer study of selected compositions in tetragonal bronze phases was undertaken to determine the extent of preferential filling of the (2c) sites relative to the (8j) sites. Previous evidence, in K0.5FeF3, showed the Fe2+ spectrum to be broadened relative to the Fe3+ spectrum, suggesting that the Fe2+ ions preferentially filled the (2c) sites while the Fe3+ and the remaining Fe2+ ions were randomly distributed over the (8j) sites.

A series of samples were prepared of composition x = 0.5, some chosen to have only Fe2+, others only Fe3+, and others to have just enough Fe2+ to fill all the (2c) sites if there were an absolute preference for these sites by Fe2+ in competition with all other divalent and trivalent ions in the sample. Mossbauer spectra of all samples were taken at room temperature. One sample (composition beginning KFe, remainder illegible in the source) was also run at liquid nitrogen temperature. This spectrum showed a peak due to Fe3+, probably due to some FeF3 in the starting material. Other samples prepared to contain only Fe2+ also showed the presence of Fe3+ as additional absorption in the region of the low-velocity member of the Fe2+ doublet. This necessitated the calculation of area ratios based on the high-velocity component only. This procedure is valid in the absence of a strong intensity asymmetry of the quadrupole doublet (Goldanskii-Karyagin effect). Areas of peaks and Mossbauer parameters were computer-fitted to the observed spectra. The Fe2+ samples showed partial resolution of the peaks due to the (2c) and (8j) sites, which made it possible to calculate area ratios to be compared with predictions based on random occupancy and on preferential occupancy.


The results of this calculation are shown below. The area ratios are expressed in terms of the ratio of the population of Fe2+ in the more abundant (8j) site to that in the less abundant (2c) site.

Composition              Ratio of Fe2+ Areas    Numerical Ratio

KFeVF6                   0.28/0.19              1.47
KFeCrF6                  0.25/0.18              1.40
KFe0.4Mg0.6(?)F6         0.25/0.20              1.25
[illegible]              0.20/0.16              1.25
[illegible]              0.12/0.08              1.5

If Fe2+ were randomly substituted in the two sites, the ratio expected would be 4:1, the ratio of the abundance of available sites. If the Fe2+ had 100% preference for the (2c) sites, the ratio would be 3:2 for samples containing one Fe2+ per formula unit and zero for those that contain 0.4 Fe2+ per unit, as this would be the amount needed to fill all the (2c) sites. The observed ratios approach the value of 3:2 for samples where one Fe2+ is present, and even in samples where 0.4 Fe2+ is present. It is significant that the preference does not appear to be due to the dipositive charge or the ionic radius, as competing divalent ions, Mg++ and Mn++, were present. Perhaps the (2c) site is less symmetrical, or is more easily distorted by the unsymmetrical electronic distribution of the Fe2+ ion, both Mg++ and Mn++ being spherically symmetrical.
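The limiting ratios quoted above follow from simple site counting. The sketch below assumes, consistent with the 4:1 site abundance and with the 0.4 Fe2+ needed to fill the (2c) sites, that each formula unit contains two transition metal ions distributed over (2c) and (8j) sites in the proportion 2:8; the function names are illustrative.

```python
from fractions import Fraction

N_2C, N_8J = 2, 8            # relative abundance of the two metal sites
METALS_PER_FORMULA_UNIT = 2  # assumed: one M(II) and one M(III) per formula unit

def ratio_random():
    """(8j)/(2c) population ratio of Fe2+ for random substitution."""
    return Fraction(N_8J, N_2C)

def ratio_full_preference(fe2_per_formula_unit):
    """(8j)/(2c) ratio if Fe2+ fills the (2c) sites before any (8j) site."""
    capacity_2c = Fraction(N_2C, N_2C + N_8J) * METALS_PER_FORMULA_UNIT
    fe2 = Fraction(fe2_per_formula_unit)
    in_2c = min(fe2, capacity_2c)
    in_8j = fe2 - in_2c
    return in_8j / in_2c

print(ratio_random())                          # 4
print(ratio_full_preference(1))                # 3/2
print(ratio_full_preference(Fraction(2, 5)))   # 0
```

The observed ratios of 1.25 to 1.5 lie close to the 3:2 limit and far from the random value of 4.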

B.  Upconversion in CdF2:YbF3:ErF3

A paper summarizing work to date in this system has been prepared for publication in the Journal of the Electrochemical Society. This manuscript summarizes results described in many of our earlier reports on this system.

Recently, Professor Greenblatt has been measuring excitation spectra at Rutgers University. These should permit a study of the relative importance of the "dimers" (R.E.3+-F-)2 detected in NMR studies, as compared to isolated Yb3+ ions, in contributing to the efficiency of the upconversion process. We are preparing some mixed Yb-Er doped samples for NMR studies to see if mixed pairs (Yb3+-Er3+-2F-) are formed and if their number is enhanced by the substitution of Ca++ for Cd++ in the crystal.

C.  Rare  Earth-Fluorine  Interactions 

Previous work in this area has concerned ligand hyperfine interactions between rare earth ions and fluoride ions in CdF2. One result has been the detection of "dimers" involving rare earth ions, as outlined in the previous section. The following is an abstract of a paper on the NMR study of a single crystal of ErF3. The crystal was grown by Czochralski pulling of an ErF3 melt in an HF atmosphere by M. Robinson of the Hughes Research Laboratories, and we acknowledge his contribution with gratitude.

The 19F NMR of ErF3 and LiErF4 powders and of a single crystal of ErF3 has been measured. For ErF3 the shift tensor was determined for the two different fluoride ion sites in the crystal lattice, and the orientations of the principal axes at each site were found. Further, the temperature dependence (200-400 K) of the isotropic and traceless shift tensors was separately determined for each site. The traceless tensor agrees with that calculated from a dipolar model within 10%, and its temperature dependence is identical with that reported for the paramagnetic susceptibility of ErF3. The isotropic shifts for the two sites are found to have different temperature dependences, showing that theories of the isotropic shift which assume one hyperfine parameter for all states of the J manifold cannot be applied to this system. Analysis of the shift using recent theories shows that (1) the shift contributions of more than one nearest-neighbor rare earth ion are additive and (2) both the covalent and polarization mechanisms of spin transfer must make comparable contributions to account for the results. The analysis of hyperfine parameters reported for the isoelectronic Ho2+ in CaF2 arrives at a similar conclusion about the contributions of covalent and polarization mechanisms for that ion.

It is shown that useful information about the shift tensor can be obtained from powder spectra if there is only one fluoride ion site, as is the case for LiErF4, but not from the rare earth trifluorides, which have two different sites in the unit cell.

The  full  paper  has  been  submitted  to  the  Journal  of  Magnetic  Resonance. 

U.S.  Army  Research  Office 

DAAG-29-75-0-0096  E. Banks


A SIMPLE NEW TECHNIQUE FOR THE MEASUREMENT OF THE ELASTIC AND
MAGNETOELASTIC PROPERTIES OF FERRITES

L.M.  Silber 

Because of fundamental scientific interest, and in view of the potential applications of ferrites in devices such as variable magnetoelastic delay lines or non-reciprocal elastic surface wave devices, a number of techniques have been developed to measure the fundamental magnetic and elastic properties which determine the behavior of these devices [1]. Among the most important of these properties, when considering the choice of material for device application, are the magnetoelastic coupling coefficient and the elastic losses. While the techniques previously developed can yield precise results for these quantities, they often require elaborate specimen preparation or the construction of special equipment. We propose a simple experimental technique using readily available equipment. It is potentially capable of yielding detailed quantitative information, and should be most convenient as a quick qualitative method of choosing materials having sufficiently interesting properties to warrant investigation by more elaborate techniques.

The experiment is essentially a modification of a technique developed by LeCraw [2]. A small sphere of ferrite is placed in the inductance of a marginal oscillator. (In our experiments a commercial nuclear magnetic resonance gaussmeter was employed, with the hydrogen-lithium sample replaced by the ferrite sphere resting in the bottom of a thin-walled glass tube.) A biasing field sufficient to magnetize the sample to saturation as a single domain (i.e., to remove domain walls) is applied. The frequency of the marginal oscillator is swept in the usual manner until a resonance is observed. At the frequency corresponding to an acoustic resonance of the sphere, magnetoelastic coupling will excite acoustic oscillations in the sphere. The absorption at resonance is determined by the geometry of the system, the magnitude of the magnetoelastic constants, and the acoustic losses. An equivalent circuit is shown in Figure 1. L represents the inductance of the oscillator. The mutual inductance M is determined by the geometry of the coil and sample and by the appropriate magnetoelastic coupling coefficients; l, c, and R reflect the elastic properties of the ferrite sample. Assuming the elastic losses are reasonably small, the frequency of resonance is determined by the elastic constants, while the Q of the resonance is determined by the acoustic losses. Observation of the strength of the absorption and of the absorption line shape allows one to estimate relative values of magnetoelastic coupling coefficients and elastic losses.

Fig. 1. Equivalent circuit for determining magnetoelastic properties of ferrite sample. [The driving block is labeled "marginal oscillator".]
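The roles of the circuit elements can be illustrated with the standard result for two inductively coupled loops: the sample branch reflects an impedance (wM)^2 / (R + jwl + 1/(jwc)) into the oscillator coil. Treating the sample branch of Fig. 1 as a series l-c-R loop is an assumption made here for illustration, and the component values below are arbitrary, not measured.

```python
import math

def resonance_and_q(l, c, r):
    """Resonant frequency (Hz) and Q of a series l-c-r branch."""
    f0 = 1.0 / (2.0 * math.pi * math.sqrt(l * c))
    q = math.sqrt(l / c) / r
    return f0, q

def reflected_impedance(f, m, l, c, r):
    """Impedance coupled into the oscillator coil through mutual inductance m."""
    w = 2.0 * math.pi * f
    z_branch = complex(r, w * l - 1.0 / (w * c))
    return (w * m) ** 2 / z_branch

# Arbitrary illustrative values (not from the report).
l, c, r, m = 1e-3, 1e-12, 10.0, 1e-6
f0, q = resonance_and_q(l, c, r)
z0 = reflected_impedance(f0, m, l, c, r)
z_half = reflected_impedance(f0 * (1.0 + 1.0 / (2.0 * q)), m, l, c, r)
print(f"f0 = {f0 / 1e6:.3f} MHz, Q = {q:.0f}")
print(f"reflected resistance: {z0.real:.1f} ohm on resonance, "
      f"{z_half.real:.1f} ohm at f0*(1 + 1/2Q)")
```

In this model the absorption at resonance scales as M^2/R, so it grows with stronger magnetoelastic coupling and with lower acoustic loss, while the line width is set by Q.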

To make the results quantitative one would have to calculate the value of M. This value would be different for each of the acoustic modes, because the elastic displacement is different for each mode. For qualitative results, however, one can use a sample with a known magnetoelastic coupling coefficient, such as yttrium iron garnet, as a calibrating standard. If the acoustic losses are sufficiently low, they could be measured by pulsing the marginal oscillator and observing the decay of the acoustic absorption.
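If the decay following a pulse is approximately exponential, the acoustic Q follows from the decay time. The sketch below assumes that the observed signal amplitude decays as exp(-t/tau), for which Q = pi * f0 * tau; it fits tau by least squares on the logarithm of a synthetic, noise-free envelope. The resonant frequency and Q used to generate the data are arbitrary, and the function names are illustrative.

```python
import math

def fit_decay_time(times, amplitudes):
    """Least-squares slope of ln(amplitude) versus time; returns tau in seconds."""
    n = len(times)
    logs = [math.log(a) for a in amplitudes]
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    sxx = sum((t - t_mean) ** 2 for t in times)
    return -sxx / sxy  # the fitted slope is -1/tau

def q_from_ringdown(f0_hz, tau_s):
    """Q for an amplitude envelope exp(-t/tau): Q = pi * f0 * tau."""
    return math.pi * f0_hz * tau_s

f0_hz, q_true = 5.0e6, 2.0e4            # arbitrary illustrative values
tau_true = q_true / (math.pi * f0_hz)
times = [i * 1e-5 for i in range(50)]
envelope = [math.exp(-t / tau_true) for t in times]
q_est = q_from_ringdown(f0_hz, fit_decay_time(times, envelope))
print(f"recovered Q = {q_est:.0f}")
```

If the recorded quantity were proportional to stored energy rather than amplitude, the same fit would apply with Q = 2 * pi * f0 * tau.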

Qualitative measurements have been made on single-crystal samples of yttrium iron garnet (YIG), gallium-substituted YIG, magnesium ferrite, and a series of nickel-zinc ferrites, all at room temperature. Samples 1/2 mm to 2 mm in diameter were used. In each instance it was possible to identify several of the acoustic modes of resonance. The results are consistent with those obtained by other techniques. YIG has a small magnetoelastic coupling coefficient and very small acoustic losses. The magnetoelastic coupling coefficient of nickel-zinc ferrite is much larger, and its acoustic losses are much larger. The magnesium ferrite has a magnetoelastic coupling coefficient and elastic losses larger than those of YIG and smaller than those of the nickel-zinc ferrites. As reported by LeCraw, one observes two or three closely spaced resonances, rather than a single resonance, corresponding to each mode. They are typically 1/2% to 1% apart. The resonant frequency of each resonance increases slightly with magnetic field, approaching a limit at high fields, due to the magnetoelastic or spin wave contribution to the elastic modulus. It is possible to calculate the appropriate magnetoelastic coupling coefficient from this variation, though this has not yet been done.

An attempt was made to observe the magnetoelastic resonance in samples of cobalt ferrite. This has been unsuccessful thus far, probably because of the poor quality of the available samples of cobalt ferrite, as evidenced by observation of ferromagnetic resonance in these samples [3].

In summary, the proposed technique offers a convenient method of obtaining qualitative results with readily available commercially built equipment, and might be useful for a preliminary investigation of a large class of materials to ascertain whether more elaborate experiments are worth undertaking. Measurements as a function of temperature could be readily made. With refinement of the theoretical analysis, it could be developed into a quantitative method of great versatility and convenience.

Joint Services Technical Advisory Committee


MAGNETOCONDUCTIVITY MODULATION OF PERMALLOY FILMS BY SUBSTRATE
ELECTROSTRICTION

P.  Mazumdar  and  H.  J.  Juretschke 

The mechanism developed in the preceding report [1], whereby a thin metallic film forming a plate of a condenser undergoes a change in conductance when the condenser is charged, as a result of electrostrictive deformation of the dielectric, applies equally well to ferromagnetic metals. In addition, however, ferromagnetic metals offer another coupling of strain to film conductivity via the film magnetization M, which leads to some new effects.

In connection with metallic field effect (MFE) measurements on permalloy films, we have observed a second harmonic response that is sensitive to both the direction and the magnitude of an applied magnetic field acting in the film plane [2]. Figure 1 shows representative data obtained on a 540 Å 90-100 permalloy film. The rms conductance per (unit surface charge density)^2, $\delta\Sigma(2\omega)$, consists of a term independent of magnetic field and angle (the dashed lines) on which there is superposed a signal varying roughly like $4\theta$ and increasing with decreasing magnetic field like 1/H.


Fig. 1. RMS second harmonic conductance of a permalloy film on mica subjected to a surface charge density $q \sin\omega t$. Crosses are experimental points. The solid curves are the fit with the angular function $F(\theta,h)$ of Eq. (4), with $\theta_e = -16°$, at four values of h. [Plot residue: vertical axis $\delta\Sigma(2\omega)$, per (C/m^2)^2, scale factor illegible; horizontal axis $\theta$; one panel labeled h = 21.5, H = 570 G.]

This second harmonic signal is known to be proportional to film thickness, and hence it must be caused by a bulk effect throughout the thickness of the thin film electrode, initiated by the strain transmitted to the film from the dielectric.
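The appearance of the response at the second harmonic is what one expects for any effect proportional to the square of the surface charge density, as an electrostrictive strain is. Taking the quadratic dependence as the assumption, a charge density $q \sin\omega t$ gives:

```latex
\left(q \sin\omega t\right)^{2} = \tfrac{1}{2}\,q^{2}\left(1 - \cos 2\omega t\right)
```

Such an effect therefore produces a steady term and a term at $2\omega$, with no component at the driving frequency.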

The well-known dependence of the conductivity $\sigma$ of ferromagnetic metals on the angle $\theta_m$ which the magnetization M makes with the current [3],

$$ \sigma = \sigma_0 - \Delta\sigma \cos 2\theta_m \qquad (1) $$

contains an explicit $2\theta$ variation. If either $\sigma_0$ or $\Delta\sigma$ is affected by isotropic strain, we obtain largely the response discussed in the preceding report, with at most also a $2\theta$ variation. The origin of the $4\theta$ dependence lies in the effect of strain on the angle $\theta_m$ of the magnetization in a fixed external magnetic field H.

In a typical magnetically uniaxial permalloy film, the equilibrium direction of M in a prescribed H is given by [4]

$$ HM \sin(\theta - \theta_m) = K \sin 2(\theta_m - \theta_e) \qquad (2) $$

where $\theta$ and $\theta_m$ are the directions of H and M in the plane of the film, $\theta_e$ is the direction of easy magnetization, and K measures the anisotropy energy. Variations of K with strain will, using Eq. (2), produce variations in $\theta_m$. If we assume a strain variation of K of the form

$$ \frac{\delta K}{K} = \sum_i \eta_i e_i \qquad (3) $$

then for a polycrystalline permalloy film we derive a change in conductance caused by the modulation of $\theta_m$ given by

$$ \frac{\delta\Sigma}{\Sigma} = \frac{\sin 2\theta_m \, \sin 2(\theta_m - \theta_e)}{2h\cos(\theta - \theta_m) + 2\cos 2(\theta_m - \theta_e)} \times [\,\cdots\,] \qquad (4) $$

[The second line of Eq. (4), a prefactor containing $2\Delta\sigma$, a combination of the strain coefficients $\eta_i$, a combination of the elastic compliances $S_{ij}$ of mica, and $q^2/2\epsilon_{33}$, is not fully legible in the source.]

The second line of Eq. (4) predicts the overall magnitude of the effect in terms of the magnetoconductive response, the effective strain coefficient of anisotropy energy, and the strain produced by a field $E = q/\epsilon_{33}$ in the mica dielectric. The first line describes the angular response $F(\theta,h)$ using the reduced field $h = HM/2K$ and the angle $\theta$. For given $\theta_e$ and $\theta$, $\theta_m$ is known through Equation (2).
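Equation (2) has no simple closed-form solution for the magnetization angle, but it is easily solved numerically. The sketch below solves it by bisection and then evaluates the angular factor F, taken here to be sin(2*tm) * sin(2*(tm - te)) / (2*h*cos(t - tm) + 2*cos(2*(tm - te))), which is a reading of the first line of Eq. (4). It assumes h > 1, so that there is a single stable equilibrium; the function names are illustrative. The value h = 21.5 is the one appearing with Fig. 1, and the angle of -16 degrees quoted in its caption is interpreted here as the easy-axis direction.

```python
import math

def equilibrium_angle(theta, theta_e, h):
    """Solve 2h*sin(theta - tm) = sin(2*(tm - theta_e)) for tm (radians)."""
    if h <= 1.0:
        raise ValueError("this sketch assumes h > 1 (single stable equilibrium)")

    def g(tm):
        return 2.0 * h * math.sin(theta - tm) - math.sin(2.0 * (tm - theta_e))

    # g > 0 at theta - pi/2 and g < 0 at theta + pi/2 whenever h > 1/2.
    lo, hi = theta - math.pi / 2.0, theta + math.pi / 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def angular_factor(theta, theta_e, h):
    """Angular response F(theta, h) as read from the first line of Eq. (4)."""
    tm = equilibrium_angle(theta, theta_e, h)
    num = math.sin(2.0 * tm) * math.sin(2.0 * (tm - theta_e))
    den = 2.0 * h * math.cos(theta - tm) + 2.0 * math.cos(2.0 * (tm - theta_e))
    return num / den

theta_e = math.radians(-16.0)
for deg in range(0, 181, 30):
    f = angular_factor(math.radians(deg), theta_e, 21.5)
    print(f"theta = {deg:3d} deg   F = {f:+.4f}")
```

For large h the magnetization nearly follows the field, and F reduces to about sin(2*theta) * sin(2*(theta - theta_e)) / (2h): a product of two 2-theta terms, hence a 4-theta variation plus an angle-independent offset, with an overall 1/H dependence.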

The angular factor $F(\theta,h)$ predicted by Eq. (4) has the requisite $4\theta$ variation, and the H dependence of its denominator reproduces well the observed behavior. The solid curves in Fig. 1 represent the predictions, at four different magnetic fields, of Eq. (4) (slightly modified to include a magnetic contribution to the anisotropy energy) based on a single set of parameters fixed by independent measurements. The absolute magnitude of these curves leads to the experimental description


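$$ \delta\Sigma - \delta\Sigma_0 = -6.7\, F(\theta,h)\, q^2 \qquad (5) $$

where $\delta\Sigma_0$ is taken to denote the term independent of field and angle (the subscript is illegible in the source).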


In order to compare this result with the theory of Eq. (4), we must know all the factors of the second line of that equation. The pertinent properties of mica are in the literature:

[compliance combination, subscripts illegible] = 5.94 x 10^[exponent illegible] m^2/newton,    $\epsilon_{33}$ = 5.5 x 10^-11 farad/m    (6)


The remaining factors were obtained by direct measurements on the sample in question. From the observed magnetoconductivity relation of Eq. (1) we obtain

[left-hand side illegible] = 4.6 x 10^-2    (7)

and the strain coefficients of the anisotropy energy were measured under static strains to be

[expression illegible]    (8)
Combining the results of Eqs. (6), (7) and (8), the predicted numerical factor of Eq. (5) is 8.7, compared with the measured magnitude of 6.7. Considering the number of interactions involved in the overall description, the agreement between experiment and theory is highly satisfactory. In fact, for other samples the match is even closer. Furthermore, the magnetostrictive properties embodied in Eq. (8) may change sign. We have verified that under those conditions the experimentally observed $\theta$-dependence also reverses. As a final check of the proposed mechanism, the permalloy samples were rigidly glued between thick glass slides, in order to suppress the lateral strains of the mica substrate. The second harmonic signal decreased by more than 80 percent under this treatment.


The  overall  modulation  of  the  direction  of  magnetization  represented  by  the  data 
of  Fig.  1 is  of  the  order  of  a second  of  arc,  and  is  obtained  by  strains  of  order  10 


These  small  variations  offer  a new  method  of  analyzing  the  magnetic  equilibrium  and 
the  possible  spatial  distribution  of  magnetic  anisotropy  energies  in  considerable  detail. 

In low fields, the coupling between magnetization and strain demonstrated here can be
used to detect small changes in strain, or small changes in the direction of magnetization.

MODULATION OF THIN FILM CONDUCTIVITY BY SUBSTRATE ELECTROSTRICTION
D. Lischner and H.J. Juretschke

In the metallic field effect (MFE) the conductance Σ of an electrode of a parallel
plate condenser is altered when the condenser is charged.¹ Experiments on condensers
fabricated from thin sheets of muscovite mica with very thin evaporated electrodes
give a change of conductance with a.c. charging that has components at both the driving
frequency and at the second harmonic.² Recent measurements show that most or all of
the second harmonic signal is proportional to the electrode thickness,³ for very thin
electrodes, indicating that this part of the response is not a true interface effect. This
report presents evidence that in most MFE experiments this second harmonic can be
traced to a conductivity modulation of the electrode through strains induced by
electrostrictive deformation of the dielectric.

When  an  electric  field  E is  applied  to  mica  in  the  direction  (001)  normal  to  the 
cleavage  plane,  the  mica  develops  isotropic  strains  in  the  cleavage  plane  derivable 
from  the  Maxwell  stress  tensor  and  the  elastic  constants  of  mica  as 


$$ e_1 = \tfrac{1}{2}\,(s_{11} + s_{12} - s_{13})\,\varepsilon_{33}\,E^{2} \qquad (1) $$


As the change in electrode conductivity is normally linear in the strains, the E²
dependence of these strains implies a second harmonic response. In order that this effect
explain the observed second harmonic it must have the correct magnitude.
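
The step from a strain quadratic in the field to a response at twice the driving frequency can be checked numerically. The following is a minimal sketch, not the authors' analysis: it assumes a sinusoidal charge density q(t) = q0 sin ωt, a strain proportional to q², and a conductance change linear in that strain. The coefficients `a` and `gamma` are arbitrary placeholder values chosen for illustration, not measured quantities.

```python
import math

def harmonic_amplitude(signal, n_harm):
    """Amplitude of the n-th harmonic (n >= 1) of one uniformly sampled period."""
    n = len(signal)
    c = sum(s * math.cos(2 * math.pi * n_harm * i / n) for i, s in enumerate(signal))
    d = sum(s * math.sin(2 * math.pi * n_harm * i / n) for i, s in enumerate(signal))
    return 2.0 * math.hypot(c, d) / n

N = 1024          # samples over one period of the driving frequency
q0 = 1.0          # charge amplitude (arbitrary units)
a = 0.05          # placeholder: strain per unit q^2
gamma = -10.0     # placeholder: fractional conductance change per unit strain

# Charge q(t) = q0 sin(wt); strain and conductance change follow q^2.
q = [q0 * math.sin(2 * math.pi * i / N) for i in range(N)]
d_sigma = [gamma * a * qi ** 2 for qi in q]

first = harmonic_amplitude(d_sigma, 1)    # component at the driving frequency
second = harmonic_amplitude(d_sigma, 2)   # component at twice the driving frequency
```

Because sin² ωt = (1 - cos 2ωt)/2, the quadratic response contains only a static shift and a second harmonic of amplitude |gamma·a|·q0²/2; no component appears at the driving frequency itself.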

The linear strain dependence of the conductivity with the current along the x-direction
is a function of all six strains⁵

$$ \frac{\delta\sigma}{\sigma_0} = \sum_j \gamma_{ij}\, e_j \qquad (2) $$


where the tensor γ_ij is defined in the coordinate system having x and y in the electrode
plane. For electrodes of epitaxial silver or antimony on mica, Eqs. (1) and (2) combine
to yield an observable change of conductance δΣ

$$ \frac{\delta\Sigma}{\Sigma} = \left[\gamma_{11} + \gamma_{12} - \frac{2c_{13}}{c_{33}}\,\gamma_{13}\right] e_1 \qquad (3) $$

where the c_ij are components of the elastic stiffness tensor of the electrode in the same
coordinate system.

The tensor γ_ij is only very incompletely known in most metals, if at all. Furthermore,
its dependence on electrode thickness or other sample parameters is uncertain.
It therefore becomes necessary to determine the combination of the γ_ij's entering into
Eq. (3) by an independent measurement on the same samples used for the MFE.

If a substrate bearing a thin film is bent statically into an arc of a circle, the
change of conductance of the film with strain depends on the direction of the plane of the
circle relative to the current direction; however, no combination of such measurements
will specify γ_11, γ_12, and γ_13 separately. Nevertheless, it turns out that a major
applied strain at 45 degrees to the current, or the sum of the effects at 0° and 90° to the
current, precisely produces a δΣ proportional to the desired combination of γ_ij of
Equation (3).

The results of some such static measurements are shown in the first column of
Table I. These numbers are of the same magnitude as those in the literature for both
bulk and thin film samples, but differ in some details. The second column in Table I
gives the predicted value of Eq. (3), using the strain of Eq. (1) (normalized to a surface
charge density of 1 coul/m²), and the third column lists the experimentally observed
second harmonic response of the MFE. The agreement between prediction and experiment
is very satisfactory, and indicates that we have identified most, if not all, of the
source of the second harmonic contribution of the MFE.

TABLE I. Strain dependence of conductivity, and conductance modulation by
electrostriction in silver and antimony.

        γ_11 + γ_12 - (2c_13/c_33) γ_13        (1/Σ)(δΣ/δq²)  [C/m²]⁻²
                                               predicted      measured

  Sb              -20.9                         -0.77          -0.69
  Ag               -8.66                        -0.32          -0.22
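
A quick consistency check on Table I can be sketched as follows. It assumes, as the text states, that the predicted column is the strain-coefficient combination of the first column multiplied by a single substrate strain per unit q² that is common to both metals. The dictionary layout and function names are illustrative only; the numbers are those of the table.

```python
# Values copied from Table I; keys are illustrative names.
table = {
    "Sb": {"gamma": -20.9, "predicted": -0.77, "measured": -0.69},
    "Ag": {"gamma": -8.66, "predicted": -0.32, "measured": -0.22},
}

def implied_strain_factor(row):
    """Predicted modulation divided by the strain-coefficient combination."""
    return row["predicted"] / row["gamma"]

def measured_over_predicted(row):
    """Fraction of the predicted second-harmonic response actually observed."""
    return row["measured"] / row["predicted"]

factors = {metal: implied_strain_factor(row) for metal, row in table.items()}
ratios = {metal: measured_over_predicted(row) for metal, row in table.items()}

for metal in table:
    print(f"{metal}: strain factor {factors[metal]:.4f}, "
          f"measured/predicted {ratios[metal]:.2f}")
```

Both metals give nearly the same implied strain factor, as expected if the strain originates in the common mica substrate rather than in the electrode, while the measured response falls somewhat below the prediction in both cases.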

As  an  additional  confirmation  of  the  proposed  mechanism,  the  MFE  sample  was 
rigidly  sandwiched  between  thick  glass  slides  which  effectively  suppressed  the  lateral 
strains  of  the  mica  and  its  film  electrodes.  Substantially  all  of  the  second  harmonic 
signal  could  be  eliminated  by  this  procedure  (although  the  first  harmonic  remained 
completely  unaffected). 

Beyond  successfully  identifying  a signal  which  probably  accompanies  all  MFE 
measurements,  the  observed  coupling  between  dielectric  deformation  and  electrode 
conductivity of a parallel plate condenser has other possible uses. It offers a method
for detecting strains of the order of 10 or less, and depending on whether the properties
of the electrode or the dielectric are unknown, it can be used to obtain information
on γ_ij of the electrode material or to determine the strains of the dielectric. This
last case is of particular interest where one looks for the electrostrictive behavior of
the dielectric resulting from the strain dependence of its dielectric constants.⁷ The
agreements of Table I indicate that in mica this contribution can be neglected, although
in other dielectrics it can add substantially to the magnitude of Equation (1).

ON THE ORIGIN OF THE METALLIC FIELD EFFECT
H.J. Juretschke, D. Lischner and P. Mazumdar

In the two preceding reports,¹,² we have unambiguously identified the second harmonic
signal δΣ(2ω) observed in metallic field effect (MFE) measurements as a bulk
effect arising from the influence of electric field-induced strains of the substrate on
the electrical properties of the metal. This third report summarizes the evidence that
the first harmonic signal δΣ(ω), a true surface effect, results, at least in part, from
the scattering of current carriers by localized interfacial strains linearly proportional
to the surface charge density q, and it proposes a model to account for this mode of
interaction.

The evidence for the role of strain in δΣ(ω) comes from several sources:

(a) Angular Dependence of δΣ(ω) in Ferromagnetic Metals: When a magnetic field
H in the plane of the sample makes an angle θ with respect to the current, as discussed
in Ref. 2, both δΣ(2ω) and δΣ(ω) are functions of θ and H. As shown in Fig. 1, both
signals have, in fact, a nearly identical form of the θ- and H-dependence.³ Since the
second harmonic arises from strain modulation of the anisotropy energy, as demonstrated
in Ref. 2, the identical functional form of the signal in δΣ(ω) argues strongly that
here we are again observing the results of strain modulation of the magnetic anisotropy
energy, in this case of the surface region of the ferromagnetic film.

As a matter of fact, most data for δΣ(ω) and δΣ(2ω) can be matched with a common
function F(θ,h) (see Ref. 2), indicating that magnetic surface and volume properties of
permalloy films are not very different. In some cases, phase shifts between the first
and second harmonic responses require introducing a small modulation of the surface
anisotropy direction, but the overall model still holds.

(b) Influence of Static Strain on Ferromagnetic MFE: When the MFE experiment
is carried out with the permalloy film under large static strain, not only does the
anisotropy energy change, but the amplitudes of the oscillations of δΣ(ω) and δΣ(2ω),
such as shown in Fig. 1, also change, and by a common factor.³ This effect on δΣ(2ω)
must be ascribed to a non-linear dependence of the anisotropy energy on strain. Since
the first harmonic δΣ(ω) follows the same pattern, it must also be connected to strain,
and most likely through the same coupling.

(c) MFE on Ferroelectric Substrates: The MFE experiment of silver on mylar
substrates gives results that depend on the polarization state of the mylar.⁴ The mylar
can be polarized in either direction by cooling it from above 150°C in an applied electric
field. In going from the nonpolarized to the polarized state of the substrate, δΣ(ω)
always changes in the direction of becoming more negative, regardless of the direction
of polarization. Since the polarized state is accompanied by atomic displacements and
strains that are the same for both directions of polarization, the state of strain at the
bonded metal-mylar interface is expected to be altered by a large amount under the
polarization treatment. Hence changes in both δΣ(ω) and strain are related.

Fig. 1. RMS first and second harmonic conductance changes of a permalloy
film on mica subjected to a surface charge density q sin ωt, as a
function of the direction of an applied magnetic field in the sample
plane. The curves a, b, c refer to three fields: H_a = 45 gauss,
H_b = 120 gauss, H_c = 530 gauss. For reference, the magnetoresistivity
of the sample as a function of angle is also shown.

(d) The Effect of Large Static Strain on the Non-ferromagnetic MFE: When the
substrate is deformed to introduce strains of the order 10⁻⁴ to 10⁻³ at the interface,
δΣ(ω) of silver or antimony on mica is observed to change, and, in fact, it often moves
irreversibly to a new value.⁴ It is likely that under such large strains the interface
relieves the local stresses by faults and dislocations, and consequently the state of
strain at the undeformed interface is permanently altered under such treatment.

(e) MFE as a Function of Temperature: The first harmonic of silver, gold, or
antimony on mica, δΣ(ω), changes reversibly with temperature by amounts comparable
to or larger than those introduced by large static strains. Because of the method of
sample preparation at high temperature, differential thermal expansion of sample and
substrate always produces strains, and these are expected to be most severe in the
region of the interface.

In summary, we have observed changes in δΣ(ω) only when the external parameter
influencing the system can be tied to changes in interfacial strain. On the other hand,
as described in Refs. 1 and 2, a macroscopic clamping of the system, while successfully
suppressing δΣ(2ω), has very little or no effect on δΣ(ω). We therefore conclude that
any interfacial strain involved in the scattering giving rise to δΣ(ω) must be either
normal to the interface, since such local strains cannot be clamped externally, or
a microscopic strain distribution in the plane of the interface having a
vanishing macroscopic average.

A model having these features can be developed along the lines recently proposed
to understand the interaction of freely falling electrons with conducting walls in a
gravitational field.⁷ If a charge Q occupies a fixed site at the metal-insulator interface,
its electric field will exert forces on the unscreened ions in its immediate neighborhood
that have two properties:

(1) The forces will change direction with the sign of Q.

(2) The net force in the plane of the interface is zero, but the net force in the
metal normal to the interface is non-vanishing.

As a result of these localized forces the material in the immediate neighborhood of Q
will deform elastically. If each of the charges deposited at the interface behaves
independently, we have a model in which these charges alter the local state of strain with
a distribution proportional to the surface charge density q. These strain fields will
scatter current carriers, apart from the direct scattering by each screened Q.

In the bulk, the predominant effect of local strain on the scattering of current
carriers arises from a screened deformation potential linear in the strains.⁸ If the surface
region is considered to be naturally strained, any additional surface charge density-induced
strain will cause a change in surface scattering linear in this strain, and hence
proportional both to q and to the original strain.

Thus we obtain a model for the MFE which contains the major ingredients of the
observed effects. It is similar to one originally proposed in terms of direct scattering
by charge patches,⁹ except that now the scattering is mediated through local interfacial
strains, as required by the observed strain dependence of the various phenomena. A
prediction of the order of magnitude of the scattering expected in this model is difficult,
since so little is known about either the actual state of strain in the surface or the
scattering cross section of the strained region. However, from the known magnitude of the
pressure dependence of the work function of metals combined with a thermodynamic
identity,  the  interfacial  strains  e are  related  to  the  surface  charge  density  by  e=  10 
(with q in Coul/m²). If we then assume that scattering with surface strains is similar
to that observed in bulk, we obtain the estimate for silver δΣ(ω) = -10⁻¹/(ohm-C/m²),
which is compatible with observation. Alternatively, the model of Ref. 9, taking into

account  that  its  quadratic  term  is  masked  by  the  electrostrictive  signal,^  requires  a 

2 - 1 

coupling  between  surface  conductance  and  strain  e of  the  type  62^  = -lOe  (ohm) 

A relation of such magnitude is not inconsistent with the scattering strength of locally
strained regions.⁸ Interestingly, this model also requires a static strain in the surface
of order 10⁻² or larger, which suggests that such strain can account for a significant
portion of the overall surface scattering of most metal surfaces.

THE INTERACTION OF INTENSE LASER RADIATION WITH METAL SURFACES
M.C. Newstein and N. Solimene

The interaction of intense optical radiation with condensed matter is an active
field of study because of the variety of technical applications of the resultant phenomena
as well as because of the scientific interest in exploring previously unavailable ranges
of the interaction parameters. At intensities of 10^-10^ watts/cm² the applications
include heating, deformation, drilling, cutting, welding, etc. of materials.¹,²,³ At
about 10⁸ watts/cm² plasma formation occurs and there is applicational and diagnostic
interest in laser produced plasmas as sources of electromagnetic radiation from the
infrared to the X-ray region, as well as of electrons and highly ionized atoms.

Applications  to  laser  fusion  become  possible  at  power  densities  of  10  watts/cm  and 

higher.⁷,⁸

We have initiated a theoretical study of the interaction with metal surfaces of short
optical pulses with intensities in the range from 10 to 10^ watts/cm². We are modelling
experimental conditions which allow for the study of physical phenomena
not fully treated in previous investigations. In contrast to most of the previous
material processing studies, the pulse durations are short enough that equalization of
the electron and lattice temperatures may not occur. Furthermore, there may be significant
variations of optical properties over the effective skin depth; hence the Fresnel
formula does not properly describe the absorption and reflection properties of the surface.
Rather, these are obtained from a self-consistent solution of the field-matter equations
for the system. On the other hand, in contrast to most of the fusion studies, our pulses
are weak enough that within times of experimental interest the material remains in a
condensed phase and the thermal parameters are those descriptive of the degenerate
electron gas in metals.

We have constructed a formulation of the problem which treats separately many
of the complex, simultaneously occurring physical phenomena in order better to evaluate
their individual physical significance. The first class of problems studied eliminated
the effect of thermal diffusion and concentrated on the development of numerical methods
for obtaining the reflection and absorption coefficients. We assume variation in one
spatial dimension (plane wave input). The metal is assumed to occupy the space described
by z > 0. It is characterized by the conduction electron plasma frequency, ω_p, the electron
collision frequency for momentum exchange, ν, and for energy exchange, ν_E, as well
as the electron and lattice temperatures, T_e and T_L. The collision frequencies are
sums of temperature independent terms (due to the effects of extrinsic lattice imperfections)
as well as temperature dependent terms (due to electron-phonon collisions).

In general, these parameters are functions of space and time; some of this dependence
is via the local electric field strength. We assume that the field (of frequency ω) within
the metal (z > 0) is of the form

$$ E = \mathrm{Re}\left[\mathcal{E}(z,t)\,e^{-i\omega t}\right] \qquad (1) $$

$$ H = \mathrm{Re}\left[\mathcal{H}(z,t)\,e^{-i\omega t}\right] \qquad (2) $$

where the time dependence of the complex envelopes ℰ and ℋ is slow compared to an
optical period 2π/ω. With this approximation, and further assumptions, to be discussed,
the dimensionless form of the field-matter equations becomes:


$$ \frac{\partial \mathcal{E}}{\partial z} = -i\omega\,\mathcal{H} \qquad (3) $$

$$ \frac{\partial \mathcal{H}}{\partial z} = -i\omega\,n^{2}\,\mathcal{E} \qquad (4) $$

$$ n^{2} = 1 + \frac{i/\omega}{\nu - i\omega} \qquad (5) $$

$$ \nu = K\,T_L(z,t) \qquad (6) $$

(7)

(8)

(9)

In  the  above,  (S  is  measured  in  units  of  (]  e j/upmc)  ^ in  (e q | e l/cj^mc^) 

in ω_p, t in 1/ω_p and z in c/ω_p. The electron, lattice and Fermi temperatures T_e, T_L,
and T_F are measured in units of the melting temperature T_M. Equations (3)-(5) are
the field equations with the effect of the medium described in terms of the complex index
n. Equation (6) describes the assumption that the momentum-exchange collision
frequency is proportional to the lattice temperature. Equation (7) gives the rate of
change of energy per conduction electron due to the competition between heating by the
applied field (first term) and relaxation to the lattice temperature (second term). A
phenomenological treatment of the relaxation process in the kinetic equations leads to this
form. Equation (8) describes the lattice heating via electron-lattice energy exchange
collisions and Eq. (9) is a simple expression for the electron energy as a function of
electron temperature which represents the behavior for temperatures less than and
comparable to the Fermi temperature in a form suitable for numerical calculations.

The reflection coefficient is determined by the surface value (z = 0) of the
admittance Y, given by


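$$ Y(z,t) = -\frac{\mathcal{H}(z,t)}{\mathcal{E}(z,t)} \qquad (10) $$

This equation implies:

$$ \mathcal{E}(z,t) = \mathcal{E}(0,t)\,\exp\left[i\omega \int_0^z dz'\,Y(z',t)\right] \qquad (11) $$

and Y satisfies

$$ \frac{\partial Y}{\partial z} = -i\omega Y^{2} + i\omega\,n^{2}(z,t) \qquad (12) $$

with the boundary conditions

$$ Y(z=\infty,\,t) = n(\infty,\,t=0) \qquad (13) $$

When Eqs. (12) and (13) are solved for Y(z,t), the complex field reflection coefficient
at each time t is obtained from:

$$ r(t) = \frac{1 - Y(0,t)}{1 + Y(0,t)} \qquad (14) $$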
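
The admittance procedure can be illustrated with a short numerical sketch. It is not the authors' program. It assumes a Riccati equation of the form dY/dz = -iωY² + iωn²(z), a Drude index n² = 1 + (i/ω)/(ν - iω) with frequencies in units of ω_p, the starting value Y = n at a finite depth standing in for z = ∞, and a fourth-order Runge-Kutta integration back to the surface. The frequency, collision frequency, depth, and the exponential "heated layer" profile are illustrative placeholders.

```python
import cmath
import math

def drude_n2(omega, nu):
    """Complex index squared, n^2 = 1 + (i/omega)/(nu - i*omega), in units of omega_p."""
    return 1.0 + (1j / omega) / (nu - 1j * omega)

def surface_admittance(omega, nu_of_z, depth=12.0, steps=4000):
    """Integrate dY/dz = -i*omega*Y^2 + i*omega*n^2(z) from z = depth back to z = 0 (RK4)."""
    def rhs(z, y):
        return -1j * omega * y * y + 1j * omega * drude_n2(omega, nu_of_z(z))

    h = -depth / steps                       # negative step: march toward the surface
    z = depth
    y = cmath.sqrt(drude_n2(omega, nu_of_z(depth)))   # Y = n deep inside the metal
    for _ in range(steps):
        k1 = rhs(z, y)
        k2 = rhs(z + h / 2, y + h * k1 / 2)
        k3 = rhs(z + h / 2, y + h * k2 / 2)
        k4 = rhs(z + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        z += h
    return y

def reflection(y0):
    """Field reflection coefficient from the surface admittance."""
    return (1 - y0) / (1 + y0)

omega = 0.15      # assumed optical frequency in units of omega_p
nu0 = 0.01        # assumed collision frequency in units of omega_p

# Uniform metal: Y stays equal to n, so the Fresnel result is recovered.
r_uniform = reflection(surface_admittance(omega, lambda z: nu0))
R_uniform = abs(r_uniform) ** 2

# Hypothetical heated surface layer: collision frequency enhanced near z = 0.
r_hot = reflection(surface_admittance(omega, lambda z: nu0 * (1 + 2 * math.exp(-z))))
R_hot = abs(r_hot) ** 2
```

For a uniform metal the right-hand side vanishes at Y = n and the sketch reduces to the Fresnel formula; a collision frequency that varies within the skin depth changes the surface admittance and hence the reflectivity, which is the situation the admittance formulation is meant to handle.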


We report below some results of this analysis for the action of 5000 Å laser radiation
on an aluminum target with incident intensities of 10⁶, 10⁷, and 10⁸ watts/cm².
It should be emphasized that we have excluded the effects of thermal diffusion for the
reason previously given. For a melting temperature of 933°K, the ratio of the Fermi
to the melting temperature is T_F = 857. The ratio of the electron-energy relaxation
time  to  momentum  relaxation  time  is  not  known.  For  the  purposes  of  the  present  cal- 
culation we  have  assumed  Vp/ Vp  = 10’^,  this  is  probably  an  underestimate. 

In  Fig.  1 we  have  plotted  the  surface  values  of  the  electron  and  lattice  temperature 
(relative to the melting temperature) as well as the power reflection coefficient |r|²
as  functions  of  time.  Common  features  of  all  three  cases  are  the  rapid  rise  and  then 
decay  of  the  electron  temperature  toward  equilibrium  with  the  more  slowly  rising  lattice 
temperature,  and  the  relatively  small  variations  in  the  power  reflection  coefficients. 

A main  effect  of  thermal  diffusion  will  be  to  decrease  the  local  rate  of  increase 
of the electron temperature. This part of the study is being implemented.

Fig. 1. Illustrating the temporal variation of electron temperature, T_e,
lattice temperature, T_L, and intensity reflection coefficient,
|r|², as functions of time at the surface of an aluminum target
irradiated with varying intensities of 5000 Å laser radiation.
The effects of thermal diffusion have been excluded from this
study in order to separately evaluate their importance.

SELF-FOCUSING OF COHERENT PULSES
M.C. Newstein and F. Mattar

The physical mechanisms responsible for coherent self focusing¹,²,³ can be elucidated
by following the propagation of the transverse energy flow correlated with the
corresponding values of the field amplitude. The electric field E is given by

$$ E = \mathrm{Re}\,\{\mathcal{E}\,\exp(i[\omega t - kz])\} \qquad (1) $$

where the complex amplitude ℰ is represented in terms of the magnitude A and phase
φ by

$$ \mathcal{E} = A\,\exp(i\phi) \qquad (2) $$

The longitudinal component of the energy current vector J is

$$ J_z = A^{2} \qquad (3) $$

and the transverse part is given by


J 


(4) 


In Fig. 1 the field amplitude is plotted versus the retarded time for three stages
of the propagation process: (a) the reshaping region; (b) the build-up region; and (c)
the focal region. The transverse energy current is plotted versus the retarded time
for the same three distances in Figures 1(d), (e), and (f). In each case the plots are
given for several values of the transverse coordinate, ρ. Positive values of the transverse
energy flow correspond to outward flow, and negative values to inward. Figure
1 clearly illustrates the following features of the self-focusing process. In the earliest
stages of the propagation (Figs. 1(a) and 1(d)) the near axis energy current is outward for
most of the pulse time, but becomes inward (self-focusing) toward the rear (τ ≈ 2.4).
For this value of τ the field amplitude (Fig. 1(a)) is already past its peak and has a
small value. As we proceed into the reshaping region (Figs. 1(b) and 1(e)) the near axis
peak amplitude moves back in time (corresponding to the fact that the group velocity is
less than c/n) while the temporal location of the change from focusing to defocusing
energy flow remains the same. This leads to a large increase in the value of the transverse
energy flow. In Figs. 1(c) and 1(f) (in the focal plane) the peak amplitude occurs
at τ = 2.4. The energy current flow in the earlier stages of the pulse is now outgoing,
corresponding to power which has already focused and is now diverging.

The results of the earlier stage (a) are in quantitative agreement with the analytic
predictions of the perturbation theory presented in the last report.

Fig. 1. The field amplitude (a),(b),(c) and the transverse energy current (d),(e),(f) for
several radii versus the retarded time for three stages of the propagation: the
reshaping region, the build-up region and the focal region (as a function of the
transverse coordinate).

Experimental verification of the phenomenon of coherent self focusing has recently
been reported.

ANALYSIS OF STIMULATED BRILLOUIN SCATTERING IN PLASMAS
E.S. Cassedy


It is well known that intense laser light can cause stimulated Brillouin scattering
(SBS) in a variety of transparent media, including sub-critical plasmas. In the case
of plasmas, the coupling results from the quiver motion of electrons, as driven by the
laser electric field. A phase-synchronous coupling at three frequencies then evolves
through the mechanism of v × B forces in the plasma, resulting in the generation of
density variations of electrons and ions (the ion-acoustic wave) and the generation of a
light wave which is frequency down-shifted from the pumping laser frequency. The
phase-synchronous coupling is of the same parametric type as that used to explain the
parametric-decay interaction (PDI), except that the down-shifted light wave in the SBS
process replaces the down-shifted Langmuir wave of the PDI.

Recently it has been suggested that a purely-growing instability exists in conjunction
with SBS, developing collinearly along the axis of the laser beam, when the
plasma is illuminated at near-critical densities. If this purely-growing instability could
be viewed as the analog of the oscillating two-stream instability (OTSI), then we
could conclude that it exists as part of a four-frequency interaction involving the
up-shifted (or anti-Stokes) light wave, in addition to the down-shifted (Stokes) light, the
ion-acoustic wave and the laser-pump light wave. Drake, et al.³ have considered a
formulation including up-shifted and down-shifted light waves, but their results indicate
a purely-growing instability transverse to the (laser) pump beam leading to a modulational
instability and self-filamentation of the laser beam. The present formulation, by
contrast, is collinear: the instability would develop along the pump beam axis and
therefore constitute a four-frequency interaction analogous to the OTSI.


Such a formulation for the four-frequency parametric coupling of light waves and
the ion-acoustic wave has been derived elsewhere⁴ and need not be repeated here. In
the Fourier-Laplace-transformed space, that formulation could be written in the identical
form as done previously for the PDI-OTSI case and the dispersion relation expressed
in the form:


1 

0 


1 

D. 


D 


+ 1 


= 0 


(1) 


where in this case, for a traveling-wave pump field E(x,t) = E_0 cos(ω_0 t - k_0 x), we have:


S.\2
°±i = 


w - i V (w  ± 

pe  e 


v] 


with 


K=WpikVo/<7T  , VQ=eEp/ 


m w„ 
e 0 


ω_pe, ω_pi = plasma electron and ion resonance frequencies, respectively

ν_e, ν_i = electron and ion damping factors

c = velocity of light.

A.  The  Dispersion  Relation 

The dispersion relation of Eq. (1) may be used to investigate the four-frequency
parametric interaction of light waves and the ion-acoustic wave. Proceeding again as
in the previous case, we search for roots to Eq. (1) utilizing Newton's Method on the
digital computer and display these solutions as conformal mappings of ω(k) on the
complex-k plane. The mapping for an SBS interaction is shown in Fig. 1 for parameters
corresponding to a laser-plasma pumped at 99.5% of the critical frequency. The parameters
are all realistic with respect to an experimental plasma, with the exception of
the damping terms (ν_i and ν_e) which have been taken as zero. The pumping field
strength  (E^)  corresponds  to  an  incident  laser  intensity  of  about  3x10  W/cm  . 
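
The root search by Newton's method can be sketched for a complex frequency as follows. This is an illustration of the technique only: the dispersion function used here is a hypothetical quadratic coupled-mode form, D(ω) = (ω - S·k)(ω + V·k) + G², standing in for Eq. (1), and the constants `S`, `V`, `G` and the starting guess are arbitrary dimensionless placeholders rather than plasma parameters from this report.

```python
def newton_complex(f, dfdw, w0, tol=1e-12, max_iter=50):
    """Newton iteration for a root of f in the complex plane, starting from w0."""
    w = complex(w0)
    for _ in range(max_iter):
        step = f(w) / dfdw(w)
        w -= step
        if abs(step) < tol * max(1.0, abs(w)):
            return w
    raise RuntimeError("Newton iteration did not converge")

# Hypothetical stand-in dispersion function (dimensionless placeholders).
S, V, G = 1.0, 50.0, 40.0

def make_dispersion(k):
    """Return D(w) and dD/dw for a fixed wavenumber k."""
    def D(w):
        return (w - S * k) * (w + V * k) + G * G
    def dD(w):
        return 2 * w + (V - S) * k
    return D, dD

k = 1.0
D, dD = make_dispersion(k)

# A complex starting guess is needed: from a real guess the iterates of a
# real-coefficient polynomial stay real and cannot reach a complex root.
root = newton_complex(D, dD, w0=1j)
print(root)
```

Repeating the search over a grid of complex k and contouring the resulting ω(k) gives the kind of mapping described in the text, in which a saddle point of the mapping signals absolute instability.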

The Riemann sheet shown is identified with the ion-acoustic mode for this SBS
interaction, since the ion-acoustic wave is the basic mode of the interaction. Absolute
instability is indicated by the existence of a saddle point in the mapping. This is to be
expected since the damping has been taken as zero and we must therefore be above the
absolute-instability threshold for this backward-wave parametric interaction.
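The root search and the saddle-point criterion described above can be illustrated with a
minimal sketch. It does not use Eq. (1). As a stand-in it assumes a generic quadratic
coupled-mode relation D(w, k) = (w - v1 k)(w + v2 k) + gamma0^2 = 0 for two
counter-propagating modes, with hypothetical velocities V1, V2 and coupling strength
GAMMA0 in arbitrary units. Newton's method is applied first in w alone, then to D = 0 and
dD/dk = 0 together, which locates the saddle point of w(k). Which sign of Im w means
growth depends on the transform convention and is not fixed here.

```python
import math

# Hypothetical toy parameters in arbitrary units (not taken from the report).
V1, V2, GAMMA0 = 1.0, 2.0, 1.0

def D(w, k):
    """Toy coupled-mode dispersion function for two counter-propagating modes."""
    return (w - V1 * k) * (w + V2 * k) + GAMMA0 ** 2

def D_w(w, k):
    """Partial derivative of D with respect to w."""
    return 2.0 * w + (V2 - V1) * k

def D_k(w, k):
    """Partial derivative of D with respect to k."""
    return (V2 - V1) * w - 2.0 * V1 * V2 * k

def omega_of_k(k, w_guess, tol=1e-12, max_iter=50):
    """Newton's method in w for one root of D(w, k) = 0 at a complex k."""
    w = complex(w_guess)
    for _ in range(max_iter):
        step = D(w, k) / D_w(w, k)
        w -= step
        if abs(step) < tol * (1.0 + abs(w)):
            return w
    raise RuntimeError("Newton iteration in w did not converge")

def saddle_point(w_guess, k_guess, tol=1e-12, max_iter=50):
    """Two-variable Newton iteration on the pair D = 0, dD/dk = 0."""
    w, k = complex(w_guess), complex(k_guess)
    for _ in range(max_iter):
        f1, f2 = D(w, k), D_k(w, k)
        # Jacobian of (D, D_k) with respect to (w, k).
        a, b = D_w(w, k), D_k(w, k)
        c, d = (V2 - V1), -2.0 * V1 * V2
        det = a * d - b * c
        dw = (-f1 * d + b * f2) / det
        dk = (-a * f2 + c * f1) / det
        w, k = w + dw, k + dk
        if abs(dw) + abs(dk) < tol * (1.0 + abs(w) + abs(k)):
            return w, k
    raise RuntimeError("Saddle-point iteration did not converge")

w_sp, k_sp = saddle_point(1.0j, 0.3j)
w_check = omega_of_k(k_sp, 1.0j)
# Closed-form saddle frequency of the toy relation, for comparison.
w_exact = 2.0j * GAMMA0 * math.sqrt(V1 * V2) / (V1 + V2)
```

For this toy relation the saddle frequency is purely imaginary, which is the textbook
signature of an absolute instability in an undamped backward-wave interaction. A mapping
like Fig. 1 would be obtained by calling `omega_of_k` on a grid of complex k values.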

These computed results for the SBS may be compared to the results of the coupled-mode
approximation, as applied to this interaction. A quadratic dispersion relation for
the coupled-mode approximation has previously been derived for generic parametric
interactions with finite pump wavelength (i.e., $k_0 > 0$) (Ref. 6) and a resonant cut-off
propagation characteristic for one of the coupled modes. The application of these
generic coupled-mode results to the case of SBS in a sub-critical plasma yields the
following quadratic approximation dispersion relation:

2 A^ 

(u  - S^K)(oj  + Vg^  = -(S^+  Vg) 

(2)

where the perturbation variables are:

Fig. 1. Stimulated Brillouin scattering - dispersion function.
Parameters: $\nu_i = \nu_e = 0$, $S_i = 2 \times 10^5$ m/sec, $S_e = 1.5 \times 10^7$ m/sec,
$m_i = 3.4 \times 10^{-27}$ kg, $m_e = 9.1 \times 10^{-31}$ kg,
$\omega_0 = 1.80 \times 10^{15}$ sec$^{-1}$, $\omega_{pe} = 1.79 \times 10^{15}$ sec$^{-1}$, $E_0 = 1.5 \times 10^{11}$ volt/m.
Derived factors (quadratic approximation): $k_i^{SBS} = 1.25 \times 10^6$ m$^{-1}$, $\omega_i^{SBS} = 0.25 \times 10^{12}$ sec$^{-1}$,
$k_{sp} = (1.25 - i\,3.53) \times 10^6$ m$^{-1}$, $\omega(k_{sp}) = (0.25 - i\,1.42) \times 10^{12}$ sec$^{-1}$.
Estimated saddle point: $k_{sp} = (2.2 - i\,2.0) \times 10^6$ m$^{-1}$, $\omega(k_{sp}) = (0.60 - i\,0.57) \times 10^{12}$ sec$^{-1}$.
Contours: Im $\omega$.
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r i,SBS 

k = k - kj 


b 

The  interaction  wavenumber  and  frequency  are: 


''>=0  - * V=f  "oe  ■ ”0>  - ' V . . 

1 


SBS  _ c e 

“l  - ®i 


SBS  . ^ \ 
I ^2 


6,7  . . 

the  branch-point  separation  is. 

r 

and  the  local  group  velocity^’ ^ is: 

c^ko  - kf®S) 


This approximate dispersion relation yields a saddle point with the following parameters:

$k_{sp} = (1.25 - i\,3.53) \times 10^6$ m$^{-1}$

$\omega(k_{sp}) = (0.25 - i\,1.42) \times 10^{12}$ sec$^{-1}$

for the case calculated above, which may be compared with the saddle point estimated
from the contour plot of Fig. 1:

$k_{sp} \approx (2.2 - i\,2.0) \times 10^6$ m$^{-1}$

$\omega(k_{sp}) \approx (0.60 - i\,0.57) \times 10^{12}$ sec$^{-1}$

Thus, we can observe that the inclusion of the anti-Stokes light component in the
dispersion relation results in a quantitative effect on the frequency, growth rate and
wavelength of the ion wave in this SBS interaction in a near-critical ($\omega_{pe} = 0.995\,\omega_0$)
plasma, but that the essential qualitative features for absolute instability are maintained.

The branch cut shown in the vicinity of the Re-k axis on Fig. 1 indicates the
connection from the ion-acoustic sheet to one of the neighboring sheets corresponding to
the Stokes or anti-Stokes shifted light waves. That is, on those sheets the dispersion
function would be closely approximated by either of the following:

$$\omega = \omega_0 - \sqrt{\omega_{pe}^2 + c^2 (k - k_0)^2}$$

$$\omega = -\omega_0 + \sqrt{\omega_{pe}^2 + c^2 (k + k_0)^2} \qquad (3)$$

in any region where parametric interaction was not manifested. These dispersion
functions are of a similar form to those discussed for the PDI-OTSI, with the velocity of
light (c) here replacing the electron thermal velocity ($S_e$). In this case, unlike the
previous one, however, there are wavenumber shifts ($k \mp k_0$) involved concurrently with
the frequency shifts ($\omega \mp \omega_0$) for the Stokes and anti-Stokes light components,
respectively. These sheets will henceforth be referred to as "c-sheets."

An investigation of the complete dispersion relation of Eq. (1) on the c-sheets
displays no solutions which suggest instabilities. That is, no saddle-point regions of
the type analogous to that found for the OTSI were found for the $\omega(k)$ conformal mappings
on the c-sheets as solutions of Equation (1). Instead, all digital computer solutions
closely approximated the values calculated from the uncoupled c-sheet
solutions (Equation (3)). This suggests, of course, that no interaction between up- and
down-shifted light waves exists for this formulation, and would lead us to conclude that
no purely-growing instability analogous to the OTSI exists for this case.

Confirmation of the absence of a purely-growing instability in the presence of SBS
has been carried out (Ref. 8) by analysis of a two-mode approximation similar to that carried

out  previously  for  other  parametric  interactions.^'  ^ It  should  be  noted,  however,  that 

this absence of an interaction holds only for a traveling-wave pump field (see Equation
(1)). If, on the other hand, a standing-wave pump field is assumed, then an absolute
instability of the purely-growing type is indicated (Refs. 8, 9).

OBSERVATION OF WAVE PACKET BIFURCATION IN A MAGNETO-PLASMA COLUMN
E. E. Kunhardt and B. R. Cheo

A. Background and Experimental Setup

The bifurcation of wavepackets has been predicted by the modulation theory of
non-linear dispersive waves (Refs. 1, 2). We report here the first observation of this
effect, using a magneto-plasma column surrounded by a conducting waveguide. Using
Maxwell's equations and a cold plasma model, it can be shown that for a weak axial
magnetic field the normalized wavepacket amplitude a(z,t) of the lowest propagating
plasma mode (TM$_{01}$), excited by an axial impulse electric field at a point along the
column, is governed by the Korteweg-deVries equation (Ref. 3):

$$\partial_z a + v_0^{-1}\,\partial_t a - \beta\,\partial_t^3 a + (\omega_c b)^{-1}\,a\,\partial_t a = 0 \qquad (1)$$

where the longitudinal component of the electric field is proportional to $\partial_z a$,
$\omega_c$ = cyclotron frequency, b = column radius, and $\beta = 1/(2 v_0 \omega_c^2)$. The phase
velocity of small oscillations in the long wavelength limit, $v_0$, is given by:

$$v_0 = b\,\omega_p \omega_c / [2.405\,(\omega_p^2 + \omega_c^2)^{1/2}]$$

The non-linear dispersion relation is given by:

$$k(\omega) \approx k_0(\omega) + k_2(\omega)\,a^2 \qquad (2)$$

with $k_0(\omega) = \omega/v_0 + \beta\omega^3$ and $k_2(\omega) = -1/(24 \beta \omega_p^2 b^2 \omega)$. Since $k_2 < 0$, the
modulation equations are hyperbolic and subject to the predicted bifurcation; i.e.,
there will be two real characteristics and two group velocities:

$$v_g \approx k_0'^{-1} \pm a / (2 b\,\omega_p\,k_0'^{2}) \qquad (3)$$
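The linear part of the dispersion relation can be checked directly from the
Korteweg-deVries equation. The sketch below assumes a small-amplitude solution
$a \propto \exp[i(kz - \omega t)]$, so that the non-linear term can be dropped; the symbol
$v_g^{lin}$ for the resulting linear group velocity is introduced here for illustration only.

```latex
\partial_z a + v_0^{-1}\,\partial_t a - \beta\,\partial_t^3 a = 0,
\qquad a \propto e^{i(kz-\omega t)}
\;\Longrightarrow\;
i k - \frac{i\omega}{v_0} - \beta\,(-i\omega)^3 = 0 .
% Since (-i\omega)^3 = i\omega^3, dividing by i gives
k_0(\omega) = \frac{\omega}{v_0} + \beta\,\omega^3 ,
\qquad
k_0'(\omega) = \frac{1}{v_0} + 3\beta\,\omega^2 ,
\qquad
v_g^{lin} = \frac{1}{k_0'(\omega)} = \frac{v_0}{1 + 3\beta v_0\,\omega^2} .
```

Under these assumptions the linear group velocity decreases with frequency, so the
higher-frequency part of a packet is expected to arrive later.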

The experimental setup is shown schematically in Fig. 1(a). The wavepackets
are excited by a homemade device called a bouncing ball generator (BBG), as used by
Pleshko and Palocz in their experiments to observe the Brillouin and Sommerfeld
precursors. Here a modified version is used which is capable of producing pulses with
peak voltages of 3.2 kV (at 50 Ω) with a duration of approximately 400 ps. This
high-intensity ultra-short pulse is applied axially along the plasma column through the
launcher at the input point. The initial disturbance caused by the impulsive field evolves
into a wavepacket as it propagates down the column. The plasma is the positive column of an
argon discharge 85 cm long with an axial B field up to 1.2 kG. The plasma background
neutral  density  “3.3xl0^^cm  T^  - 2.5eV  - 17MHz,  f^  - 1.3GHzandf^  » 

0.63 GHz. The wave launcher (Fig. 1(b)) is a 50 Ω parallel-plate transmission line
tapered to match into a coaxial line with better than -30 dB reflection loss. The plates are
separated by about 1 cm. Hence, when a 3 kV pulse is applied to the coaxial line with


= b u)  u>  /2.  405  (u  + w ) ' . 
0 pc  ' p c' 
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(a)  EXPERIMENTAL  SETUP 


Fig.  1. 

about 4 dB loss, a 1.9 kV/cm electric field would appear between the plates. A hole is
bored in each plate, and the plasma tube of radius 0.66 cm with the conducting
waveguide is inserted perpendicular to the plates so that the impulsive field is applied in
the axial direction, as shown in Fig. 1(b). The polarity of the field cannot be reversed
because of the ground loop requirements of the equipment used in the setup. To ensure tight
coupling at the input, a pair of copper rings is built into the discharge tube so that direct
contact between the plasma and the plates is achieved. The electric field $E_z$ of the
wavepacket propagating along the column is picked up by a similar structure and is observed
on a sampling scope (Tektronix 564). All experimental phenomena were observed during a
period of less than 50 ns, which is short compared with the fastest collision time of the
plasma (> 100 ns). Thus, the observations are free from undesirable effects such
as ionization and heating, which usually accompany high-amplitude non-linear experiments.
A series of wide-band attenuators is inserted in both the input and output lines to allow
a wide range of input field strengths to be used. Each time an attenuator is removed
from the input line, it is added to the output line to keep the overall loop gain
constant. The plasma background density is monitored continuously by a microwave cavity
to observe long-term stability. The short-term stability of the system is extremely
good when no axial magnetic field is applied. This is demonstrated by the oscillogram
of Fig. 2(a), which is a typical observed wavepacket at the linear level. Each dot on the
oscillogram represents one real-time experiment. These oscillograms can often be
repeated many times without noticeable changes. From the time required to complete
one oscillogram (several seconds) and the repetition rate of the BBG (~100 pps), one
can estimate that the experiments are repeatable thousands of times in real time. When
a DC axial B field is applied, some small striations (~100 Hz) will occur. Therefore
care must be exercised in processing the data. To improve accuracy, an X-Y recorder
driven by the sampling scope was used. The output waveforms were recorded on
14" x 10" recording paper. In a slowly recorded waveform the striations show up
as a slight up-and-down jitter of the recording pen, as shown typically in Figure 2(b).
Because of the large display, there is no difficulty in retracing the waveforms to smooth
out the jitter. The smoothed waveforms were digitized for computer Fourier analysis
with the aid of a carefully devised system which incorporates a calibrated grid, a set
of high-quality helipots and a digital voltmeter. The resulting amplitude data are
accurate to the second digit, with the third digit approximate. The time base is as accurate
as that of the sampling scope. We note that if the jitter were included in the data, it
would only appear as a fictitious very-high-frequency component in the spectra.
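The launcher field and the shot count quoted in this subsection follow from simple
arithmetic, sketched below. The sketch assumes that the 4 dB loss acts on the pulse
voltage (ratio 10^(-dB/20)) and that the full remaining voltage appears across the 1 cm
gap. The 5-second sweep time is a hypothetical stand-in for "several seconds", and the
function names are illustrative.

```python
def launcher_field_kv_per_cm(pulse_kv, loss_db, gap_cm):
    """Field between the plates after a voltage attenuation of loss_db decibels."""
    return pulse_kv * 10.0 ** (-loss_db / 20.0) / gap_cm

def shots_per_oscillogram(sweep_seconds, rep_rate_pps):
    """Number of pulses (one dot each) that make up one sampled oscillogram."""
    return sweep_seconds * rep_rate_pps

# 3 kV pulse, 4 dB loss, plates about 1 cm apart (values from the text).
field = launcher_field_kv_per_cm(3.0, 4.0, 1.0)

# Assumed 5 s sweep at the quoted ~100 pulses per second.
shots = shots_per_oscillogram(5.0, 100.0)
```

With these inputs the field comes out close to the 1.9 kV/cm quoted above, and each
oscillogram is built from several hundred pulses, so a handful of repeated oscillograms
already corresponds to thousands of shots.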

B.  Results 

Figure 3 shows a typical sequence of the smoothed output waveforms taken at
a distance of 65 cm from the input, corresponding to the plasma background given earlier.
Other conditions give minor variations from this sequence. Figure 3(a) shows the
response at the linear level, which is the familiar Airy-function-like wavepacket.
As the input impulse field is increased, gross non-linear phenomena begin to occur.
From Figs. 3(b)-(d), one observes that signals arrive sooner than the linear waves and
that considerable distortion of the original wavepacket begins to take place. Figure 3(d)
indicates an interference of waves. To analyze these waveforms, computer Fourier
transforms of the signals were performed. At least 10 data points were used in one
period of oscillation. The spectral amplitude and phase for each waveform in Fig. 3
are shown in Figure 4. At the linear level, Fig. 4(a), only one peak (C) is shown,
corresponding to the wavepacket of Fig. 3(a). The slope of the phase curve near the peak
represents a time delay from some reference point. The slight curvature of the phase
characteristic indicates a slight amount of FM present in the wavepacket. Using the
calculated linear group velocity and the phase slope, this reference point can be
established and can be used for determining the group velocities of the other wavepackets.
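The relation between phase slope and time delay used here can be sketched numerically.
The example below is not the authors' analysis program. It assumes the transform
convention X(f) = ∫ x(t) exp(-i 2π f t) dt, for which a packet delayed by t_d has spectral
phase -2π f t_d near its peak, and it uses a synthetic Gaussian packet with hypothetical
numbers (times in ns, frequencies in GHz).

```python
import cmath
import math

def dft_at(samples, times, freq):
    """Riemann-sum Fourier transform of uniformly sampled data at one frequency."""
    dt = times[1] - times[0]
    total = sum(x * cmath.exp(-2j * math.pi * freq * t) for x, t in zip(samples, times))
    return total * dt

def delay_from_phase_slope(samples, times, f_peak, df):
    """Delay -(1/2 pi) d(phase)/df from a centered difference around f_peak."""
    lo = dft_at(samples, times, f_peak - df)
    hi = dft_at(samples, times, f_peak + df)
    # Phase change over 2*df, taken from the product so that it stays unwrapped
    # as long as df is small compared with 1 / delay.
    dphi = cmath.phase(hi * lo.conjugate())
    return -dphi / (2.0 * math.pi * 2.0 * df)

# Synthetic packet: 0.5 GHz carrier, 3 ns Gaussian envelope, delayed by 20 ns,
# sampled every 0.1 ns (20 points per period of the carrier).
T_DELAY, F_CARRIER, SIGMA, DT = 20.0, 0.5, 3.0, 0.1
times = [n * DT for n in range(500)]
packet = [
    math.exp(-0.5 * ((t - T_DELAY) / SIGMA) ** 2)
    * math.cos(2.0 * math.pi * F_CARRIER * (t - T_DELAY))
    for t in times
]
delay = delay_from_phase_slope(packet, times, F_CARRIER, 0.001)
```

Applied to two spectral peaks of the same record, the difference of the two delays
gives the difference in arrival times of the corresponding packets.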

At higher amplitude two distinct phenomena occurred (Figures 4(b)-(d)). The first is
the appearance of a low-frequency peak at f = 0.147 GHz (A). The amplitude of this peak
increases more rapidly as the input level is increased (recall that the loop gain is kept
constant). There is a slight downward shift of frequency as the amplitude is increased.
From the phase slope (i.e., time delay), we can show a linear amplitude dependence of
velocity. No further analysis of this packet was made, but we strongly suspect that it
belongs to the class of solitary waves.

The main subject of this report is the change associated with the peak C. The
peak  C corresponding  to  the  original  wavepacket  is  seen  here  to  break  into  two.  There 
is  a definite  shift  to  higher  frequency  of  the  original  peak.  A gradual  appearance  of 
another  peak  B at  a lower  frequency  indicates  another  wavepacket.  The  difference 
between  the  slopes  of  the  phase  of  these  two  peaks  yields  the  difference  in  the  times  of 
arrival  of  these  packets.  Therefore  a splitting  of  the  original  wavepacket  has  taken 
place.  The  higher  frequency  packet  C has  a longer  delay  and  hence  a lower  velocity. 

According to the modulation theory (Refs. 1, 2), the difference between the two group
velocities is proportional to the amplitude, as shown by Equation (3). Hence the difference
between the arrival times $\Delta t$ should also be proportional to the amplitude to first
order. This is shown in Figure 5. The observation that the higher-frequency wavepacket
travels at a lower velocity is also in agreement with the modulation theory.

From the measured value and the dispersion relation, we can calculate from the
modulation theory the field amplitude required for a given velocity difference. It is found
that to achieve the largest separation of 3.4 ns over a distance of 65 cm, the required
amplitude of the wavepacket is about 50 V/cm. This field represents a total energy of

_Q 

about  1.3x10  J in  the  wavepacket.  The  energy  contained  in  the  exciting  electric  field 
impulse  is  about  3x10  ^J.  Since  most  of  the  energy  of  the  impulse  goes  into  the  dum- 
my load  and  radiates  to  outside  the  parallel  plate  structure,  it  would  be  of  interest  to 
estimate  the  coupling  efficiency,  i.e.  , how  much  energy  can  be  coupled  into  the  plasma. 
As a rough bound estimate, we assume that when the impulse field arrives at the plasma
tube, the 2 kV/cm field is applied to all electrons in the region between the plates during
the pulse duration ($\tau \approx 0.4$ ns), so that each electron gains a momentum of $eE\tau$. The
total kinetic energy gained by all electrons in the region in this way is of the order of
9x  10”^  J.  Comparing  the  orders  of  magnitude  of  the  three  energy  levels,  we  feel  that 
the 50 V/cm field required to produce the measured wavepacket separation is certainly
reasonable for the experimental setup used.

Fig. 5. Time separation vs. amplitude.
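The per-electron step of the rough bound estimate above can be written out as a short
calculation. This is a sketch under the stated assumption that a free electron starting
from rest is accelerated by a constant 2 kV/cm field for 0.4 ns. The number of electrons
between the plates is not given in this section, so `n_electrons` is a hypothetical
input, and the function names are illustrative.

```python
E_CHARGE = 1.602e-19    # elementary charge, C
M_ELECTRON = 9.109e-31  # electron mass, kg

def electron_energy_gain(field_v_per_m, duration_s):
    """Kinetic energy p**2 / (2 m) for the momentum p = e E tau gained from rest."""
    p = E_CHARGE * field_v_per_m * duration_s
    return p * p / (2.0 * M_ELECTRON)

def total_energy_gain(n_electrons, field_v_per_m, duration_s):
    """Total energy if every one of n_electrons gains the same momentum."""
    return n_electrons * electron_energy_gain(field_v_per_m, duration_s)

# 2 kV/cm = 2e5 V/m applied for 0.4 ns (values from the text).
ke_joule = electron_energy_gain(2.0e5, 0.4e-9)
ke_ev = ke_joule / E_CHARGE
```

The total then scales linearly with the assumed electron count, so the comparison with
the wavepacket energy depends mainly on the plasma density in the launcher gap.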

In conclusion, we believe that we have shown sufficient evidence that the
wavepacket splitting effect has been observed as predicted by the modulation theory. This
new experimental observation should complement other efforts, such as computer
simulations, to offer some confidence in the application of the modulation theory to other
non-linear wave problems.

RESONANT COUPLING OF IONIZATION WAVES AND ACOUSTIC GRAVITY WAVES
IN THE PRESENCE OF A MAGNETIC FIELD

H. Eun and S. H. Gross

In a recent paper (Ref. 1) it was demonstrated that ionization may be locally resonant at
each level to passing neutral gravity waves and that this type of response may be
significant in explaining traveling ionospheric disturbances (TID's) in the F region of the
ionosphere. The resonance arises from strong coupling, for certain frequencies and
directions of propagation, between a neutral wave and an acoustic wave characteristic
of the ionization.

Though  most  of  the  examples  given  in  Ref.  1 were  for  media  without  a magnetic 
field,  they  gave  some  results  for  a magnetic  field  when  propagation  is  in  the  magnetic 
meridian.  It  was  stated  without  explanation  that  for  the  situation  in  which  there  is  a 
magnetic  field,  resonance  occurred  at  a given  frequency  for  two  different  directions  of 
propagation,  in  contrast  with  that  for  the  situation  in  which  there  is  no  magnetic  field 
and  in  which  there  are  two  resonant  frequencies,  each  having  its  own  direction.  In  this 
contribution  we  provide  fuller  details  for  a magnetic  field.  How  two  resonant  directions 
arise  is  explained  graphically.  In  addition,  results  for  propagation  out  of  the  meridian 
plane  are  presented  which  show  that  the  general  features  of  meridional  plane  propagation 
are  mostly  preserved,  though  the  direction  is  at  an  angle  to  that  plane. 

In an early paper (Ref. 2) a similar problem was treated. The analysis was incorrect
because it neglected the effects of ions, as pointed out in Reference 3. Two figures in
Ref. 2, however, show on examination the possibility of two resonant directions, though
the presentation is somewhat different from that given here. Because we believe the
possible existence of two resonant directions is more important than may have been
appreciated, we emphasize this feature in our treatment and discussion.

Though still other authors have more recently analyzed the passage of acoustic
gravity waves through the ionosphere, the resonant coupling phenomenon was not
recognized. Reference 6 also reviewed and discussed original work on resonant coupling
but did not develop the theory further. The relevance of these previous efforts was
discussed in Ref. 1 and will not be repeated here.

As was discussed in Ref. 1, some amount of loss may actually enhance coupling, though
the resulting waves are damped on propagating away from the coupling region. An example
showing resonant coupling with thermal conductivity losses was presented in that paper,
which demonstrated that resonant coupling is unaffected by losses. The role of viscosity
in the F region was also considered and deemed unimportant in relation to thermal
conductivity as a loss mechanism. In view of these previous results, viscosity and thermal
conductivity were neglected for simplicity in the analysis and computations presented here.

With the exception of assumptions related to the magnetic field, the method of
analysis is that described in Reference 1. As before, we linearize by a perturbation
analysis, use Cartesian geometry, assume thermodynamic equilibrium, neglect ionization
production and loss, assume time-stationary background quantities, treat the
ionization as a single fluid of half the ion mass with a content minor in comparison with
the neutrals, and utilize the plane wave approximation. Though the last assumption is
simpler to use, the same essential results are obtainable from a WKB analysis similar
to that given in Reference 7. Additional assumptions associated with the magnetic field
are that the field lines are straight but inclined as given by the dip angle I and that
motion transverse to the field lines is negligible. In the F region the electron and ion
collision frequencies are taken to be much smaller than their respective gyrofrequencies.
These various assumptions, however, are such that hydromagnetic waves are eliminated
from consideration, a matter for further examination.

Examples are given showing two resonant directions. It is also shown that a strong
coupled response of ionization is possible over a very large frequency band and for a
fair range of directions that in some cases may be quite wide. The possibility is
illustrated that a peaked response may occur over an altitude range of the order of 100 km
for a wave of a particular frequency and direction. Such an altitude range would
suggest that strong coupling is not as localized as had originally been thought. Though this
suggestion must be checked by a more rigorous analysis, its correctness would
considerably augment the importance of the resonance mechanism in explaining observed
perturbations.

First, an analysis is presented in which the ionization density perturbation is
derived. The resonance and its properties are then treated. Calculations for typical
cases are presented afterward to show the effects of varying both the dip angle and the
angle between the plane of propagation and the meridian plane.

A.  Analysis 

Fig. 1. Coordinate system.

A Cartesian coordinate system is used as shown in Fig. 1, the x and y axes being
directed horizontally toward magnetic north and west, respectively, and the z axis
directed vertically upward. The magnetic flux density vector $B_0$ is taken to lie in
the xz plane with a dip angle I that is positive in the northern hemisphere. The
perturbing neutral wave propagates in the direction of its wave number vector k, located with
respect to the coordinate axes by the colatitudinal angle $\theta$ measured from the z axis and
by the azimuthal angle $\phi$ measured from the x axis. Quantities having the subscripts
0 and 1 represent background and perturbation quantities, respectively. The subscripts
n, e, and i designate neutral, electron, and ion fluid quantities, respectively. A
quantity without these subscripts refers to the entire medium. The symbols p, $\rho$, n, T,
C, m, g, and H represent, in that order, pressure, mass density, number density,
temperature, peculiar or diffusion velocity, mass, gravitational acceleration, and scale
height.  Peculiar  velocity  C is  specified  in  relation  to  the  mean  mass  velocity  £^,  which 
is  taken  to  be  the  particle  motion  caused  by  the  perturbing  neutral  acoustic  gravity  wave. 

is the ambipolar diffusion coefficient, and $\nu_{ij}$ is the momentum transfer collision
frequency of species i with species j.

A two-fluid system is used, one fluid being for the ionization and the other for the
neutrals. The equations for the neutrals may be replaced by those for the entire
medium, since the ionization content is very small in relation to that of the neutrals. The
background ionization is assumed to undergo ambipolar diffusion along the field lines
with peculiar velocity


= -D.
where $\nabla_b$ is the scalar component of the del operator along the field line, $a_b$ is the unit
vector along that line, and $C_{ib}$ is the background ambipolar diffusion velocity. The
background neutral motion, however, is taken to be zero, and the unperturbed neutrals
are distributed hydrostatically. Neutral particles move only with the wave perturbation
velocity  £j  which  is  taken  to  be  the  mean  mass  velocity. 

The perturbed motion of the ionization is described by the following set of
hydrodynamic equations:


^Wii  + VqW.j  + 2(cos  I - sin  I ^)p.j  - p.jg  sin  1 
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^Pil+  V^Wi,*  V^(Pi„Cib)=0  (3) 

Pil  = "ilVo  + 

Here  K,  is  Boltzmann's  constant,  W.  = p.C.  is  the  ionization  momentum  flux  along  the 
b X 1 ® 

field  line,  and  is  the  component  of  the  neutral  particle  velocity  along  the  field 
line. Equation (2) is the momentum equation for the component along the field line,
Eq. (3) is the continuity equation, and Eq. (4) is the perturbed equation of state. The
temperature  perturbation  is  determined  only  by  the  neutral  medium's  response  to  the 
acoustic  gravity  wave  because  of  the  assumption  of  thermodynamic  equilibrium.  Wave 
quantities  for  the  neutral  medium  obey  the  well-known  acoustic  gravity  wave  dispersion 
relationship,  and  perturbation  quantities  that  are  solely  due  to  neutral  waves  in  Eqs. 
(2),  (3),  and  (4)  are  regarded  either  as  source  terms  driving  the  ionization  or  as  terms 
coupling  the  neutral  motion  to  the  ionization.  The  motion  of  the  neutrals  is  assumed  to 
be  known,  and  the  purpose  of  the  analysis  is  to  find  the  response  of  the  ionization  to 
this  perturbation. 

Since we seek a plane wave solution, we take $(\partial/\partial x) = i k_x$ and $(\partial/\partial z) = i k_z$. Then,
with $(\partial/\partial t) = -i\omega$, Eq. (2) through Eq. (4) may be utilized to yield the ionization density
perturbation as given by

$k_b = k_x \cos I - k_z \sin I$

a modified ambipolar diffusion coefficient, and prime and double prime notations above
background quantities mean first and second derivatives with respect to z, respectively.
Equation (5) was given in Ref. 1 without derivation.
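The plane-wave substitution can be checked numerically. The sketch below assumes that the
unit vector along the field line is a_b = (cos I, 0, -sin I), which is consistent with the
expression for k_b above and with field lines that point downward for positive dip angle;
that direction is an assumption here, not a quotation from the analysis. The wave
parameters are hypothetical, and k_z is given an imaginary part because the neutral wave
vector is complex.

```python
import cmath
import math

def plane_wave(x, z, t, kx, kz, omega):
    """Plane-wave factor exp[i(kx x + kz z - omega t)]; kz may be complex."""
    return cmath.exp(1j * (kx * x + kz * z - omega * t))

def derivative_along_field(f, x, z, dip_rad, h=1e-6):
    """Central-difference derivative of f(x, z) along a_b = (cos I, 0, -sin I)."""
    ax, az = math.cos(dip_rad), -math.sin(dip_rad)
    return (f(x + h * ax, z + h * az) - f(x - h * ax, z - h * az)) / (2.0 * h)

# Hypothetical values in arbitrary units.
KX, KZ, OMEGA = 0.8, -0.3 + 0.05j, 2.0
DIP = math.radians(60.0)
X0, Z0, T0 = 0.4, 1.1, 0.7

def wave(x, z):
    return plane_wave(x, z, T0, KX, KZ, OMEGA)

# Component of k along the field line, as defined in the text.
k_b = KX * math.cos(DIP) - KZ * math.sin(DIP)

numeric = derivative_along_field(wave, X0, Z0, DIP)
analytic = 1j * k_b * wave(X0, Z0)
```

The two values agree to within the finite-difference error, i.e. the derivative along the
field line acts on a plane wave as multiplication by i k_b.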

Equation  (5)  illustrates  the  dependence  of  the  ionization  density  perturbation  on 
background  quantities  as  well  as  the  neutral  wave  perturbation  quantities  and  Tj. 

The  neutral  density  perturbation  also  enters  the  formulation  through  v^. 


B.  Resonance 

Resonance occurs when the denominator in Eq. (5) is zero. It should be noted
that the dispersion relationship for ionization acoustic waves with particle motion
constrained along the field line is given by the same expression when it is set equal to zero.

The  characteristics  of  resonance  are  most  easily  demonstrated  for  the  case  of  a 
lossless  isothermal  atmosphere  when  we  take  w « v^.  Eigenvalues  of  the  ionization 
acoustic  wave  may  then  be  obtained  from 


(9) 


which  is  obtained  from  the  denominator  of  Eq.  (5)  by  using  these  simplifying  assumptions. 


The k vector of the neutral wave is complex, since the z component has an
imaginary part ($-1/2H_n$) that results in the growth of the relative perturbation density and
the perturbation velocity with altitude. On separating Eq. (9) into real and imaginary
parts, one obtains

where $k_{br}$ is the real part of $k_b$. The $\omega$ in Eq. (11) will be the critical coupling or
resonance frequency for the given dip angle and altitude if the acoustic gravity wave
number  vector  k,  with  a component  along  the  magnetic  field  kj^,  and  satisfy  the 
acoustic  gravity  wave  dispersion  relationship.  Unlike  the  similar  situation  without  a 
magnetic  field,  as  given  in  Ref.  1,  does  not  enter  the  ionization  relationships  Eqs. 
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(10)  and  (11)  obtained  from  Eq.  (9).  There  is  now  only  one  wave  vector  component, 

It  is  determined  entirely  by  the  dip  angle,  neutral  particle  scale  height  (or  temperature), 
and  the  collision  frequency  through  the  diffusion  coefficient  The  frequency  oj  = 

in  Eq.  (11)  also  depends  on  these  same  quantities.  In  the  northern  hemisphere,  I is 
taken  as  positive,  though  the  field  lines  point  downward.  From  Eq.  (11)  it  is  obvious 
that  must  be  positive;  that  is,  the  real  part  of  the  component  of  k along  the  field 
line  must  be  downward  along  the  field  line.  This  would  imply  that  the  real  part  of  the 
vector  k,  namely  k^,  would  very  likely  be  downward  as  well.  In  the  southern  hemi- 
sphere, sin  I is  negative  and  k^^^  must  be  negative  or  downward,  too.  The  neutral  wave 
vector  k or  k^  is  also  likely  to  be  downward. 

The  critical  values  of  k,  k^,  and  k^  are  found  for  a given  altitude  and  dip  angle  by 
simultaneously solving Eqs. (10) and (11) with the dispersion relationship for acoustic-gravity
waves. This procedure yields, in general, two directions for resonance for the same
frequency given by Equations (10) and (11). One may proceed to treat this
problem analytically, but it is preferable for the purposes here to utilize geometrical
diagrams to illustrate the nature of the resonance and how it changes with dip angle.
It is best to show these relationships for propagation in the plane of the magnetic
meridian. The characteristics are essentially the same for propagation out of the meridian,
as one may show by projecting the wave vector surface for the neutral medium into the
meridian plane and proceeding as described below.

The relationship between k, $k_r$, and $k_{br}$ must be appreciated. Here $k_r$, which
is the real part of k, is given by $k_r = k - i k_{zi} a_z$, where $k_{zi}$ is the imaginary part of $k_z$
and $a_z$ is the unit vector along z. Thus $k_r = a_x k_x + a_z k_{zr}$, where $k_{zr}$ is the real part
of $k_z$ and $a_x$ is the unit vector along x. Then $k_{br}$, the real part of $k_b$, is the component
of $k_r$ along the magnetic field line. We may then take the vector $k_{br}$ to be the vector
along the direction of $B_0$ with magnitude $k_{br}$. Given $k_{br}$, all possible $k_r$ with this
component $k_{br}$ must have their vector arrowheads lying along a line perpendicular to
$k_{br}$, the line passing through the arrowhead of $k_{br}$. The construction for the northern
hemisphere is shown as line L in Figure 2.

Figure 2 represents the k_r vector space in the meridian plane. Also shown in the
figure are two hyperbolic curves representing the gravity wave dispersion relationship
in this plane for the frequency ω = ω_c and the altitude of interest. Line L intersects
each hyperbola as shown, and these intersections give the two wave directions, labeled
(1) and (2), that satisfy both Eqs. (10) and (11) and the gravity wave dispersion relationship.
Any wave of frequency ω = ω_c traveling in the neutral medium in either of these two directions
at the particular altitude would result in resonant coupling to ionization acoustic waves.
These acoustic waves would then leave the coupling region and propagate on their own.




Fig.  2.  Existence  of  double  resonances  for  a given 
dip  angle  for  the  gravity  wave  branch. 
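
The construction of Fig. 2 can be sketched numerically. The example below is only a minimal
illustration and is not the calculation used in this report. It assumes the standard isothermal
acoustic-gravity dispersion relationship written as k_z² = A k_x² + B, with A = (ω_g/ω)² - 1 and
B = (ω² - ω_a²)/c², and it assumes that line L is k_x cos I - k_z sin I = k_par for a field
directed northward and downward. The names `resonance_intersections`, `k_par`, `omega_g`,
`omega_a`, and `c_s`, and all numerical values, are hypothetical; Eqs. (10) and (11) are not
evaluated here.

```python
import math


def resonance_intersections(omega, dip_deg, k_par, omega_g, omega_a, c_s):
    """Intersect line L with the assumed dispersion curve kz^2 = A*kx^2 + B.

    omega   : wave frequency (rad/sec)
    dip_deg : dip angle I (degrees), northern hemisphere assumed
    k_par   : prescribed component of the real wave vector along the field
    omega_g : Brunt-Vaisala frequency (hypothetical input)
    omega_a : acoustic cutoff frequency (hypothetical input)
    c_s     : sound speed (hypothetical input)
    Returns a list of (kx, kz) pairs, x northward and z upward.
    """
    A = (omega_g / omega) ** 2 - 1.0          # positive on the gravity branch
    B = (omega ** 2 - omega_a ** 2) / c_s ** 2
    ci = math.cos(math.radians(dip_deg))
    si = math.sin(math.radians(dip_deg))

    # Substituting kz = (kx*ci - k_par)/si into kz^2 = A*kx^2 + B gives
    # a*kx^2 + b*kx + c = 0 with the coefficients below.
    a = ci ** 2 - A * si ** 2
    b = -2.0 * ci * k_par
    c = k_par ** 2 - B * si ** 2

    if abs(a) <= 1e-12 * (ci ** 2 + abs(A) * si ** 2):
        # Line L parallel to an asymptote: at most one intersection.
        roots = [-c / b] if b != 0.0 else []
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            return []                          # no resonance
        root = math.sqrt(disc)
        roots = [(-b + root) / (2.0 * a), (-b - root) / (2.0 * a)]
    return [(kx, (kx * ci - k_par) / si) for kx in roots]


# Illustrative (hypothetical) parameters only.
for kx, kz in resonance_intersections(1.79e-3, 40.0, 2.0e-5, 1.0e-2, 1.1e-2, 800.0):
    theta = math.degrees(math.atan2(kx, kz))   # angle from the upward vertical
    print(f"kx = {kx:.3e}, kz = {kz:.3e}, theta = {theta:.1f} deg")
```

In this form the special case in which the slope of line L equals the asymptotic slope of the
hyperbola corresponds to a vanishing quadratic coefficient, which leaves a single intersection.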

The geometrical analysis also exhibits the changes in the nature of the resonance
as the dip angle is increased from small values to larger values for a given altitude.
The frequency ω = ω_c increases with sin² I and decreases inversely with background
density (Equations (10) and (11)). For small values of I, ω_c is relatively small at any
given altitude, and the required waves in the neutral medium are likely to be in the
gravity wave range. The hyperbolas are steep, and intersections, or wave coupling
directions, are in the third and fourth quadrants of the k_r plane for the northern
hemisphere, that is, in the downward direction to the north (+x or +k_x) and south (-x or -k_x).

It is apparent from Fig. 2 that the slope of line L for small I is smaller than the
asymptotic slope of the first-quadrant branch of the hyperbola. It can be shown that the
latter slope decreases more rapidly as I increases than the slope of line L. At some
value of I the slopes are the same and there is only one intersection, which is in the fourth
quadrant. Further increases in I produce intersections in the first and fourth quadrants,
as shown by L_1 in Fig. 3 for frequency ω_1, yielding both upward and downward northerly
directions for resonance. As I increases further, the hyperbolas move toward higher
values of wave number, and eventually line L is just tangent to the hyperbola, as shown by
L_2 at frequency ω_2 > ω_1 in Figure 3. Resonance is not possible in the gravity wave branch
for further increases in I, so that ω_2 represents the upper cutoff of the phenomenon in this
branch. The cutoff frequency is less than the Brunt-Väisälä frequency at the particular
altitude. At high enough altitudes, further increases in I may permit resonance at
frequencies in the acoustic branch of the acoustic-gravity wave dispersion relationship,
as shown in Fig. 4, in which line L intersects ellipselike wave vector curves that are
characteristic of this branch. L_1 for ω = ω_1 in Fig. 4 represents the low-frequency cutoff
resonance in the acoustic branch. Here ω_1 is greater than the acoustic branch cutoff
frequency. L_2 exhibits two resonant directions for frequency ω_2 > ω_1. Acoustic
branch coupling may possibly occur for altitudes above 500 km. It should be noted that
Ref. 8 reports localized ionospheric disturbances at acoustic frequencies. Acoustic
branch coupling within the assumptions here is conceivable up to I = 90°.

Fig. 3. Same as Fig. 2 but for larger values of
dip angle. The upper limit of I for
resonance in the gravity wave branch is
shown at I = I_2, ω = ω_2.

C.  Calculations 

Calculations are made for various altitudes and various directions of propagation
of the acoustic-gravity wave in the neutral medium. The background model that is
utilized is the same as that given in Figs. 1 and 2 of Reference 1.

Figures 5 and 6 are polar plots of the ionization perturbation, normalized to the
zero frequency response, in the meridional plane in the northern hemisphere for an
altitude of 400 km in Fig. 5 and 250 km in Figure 6. The normalized perturbation is plotted
in each figure versus θ, the colatitudinal angle in Figure 1. Zero degrees is along the
upward vertical. The angles are labeled positively from 0° to 180° for the right-hand
or north-directed (+x) side and negatively from 0° to -180° for the left-hand or
south-directed (-x) side. North is along θ = 90°, and south along -90°. The dip angle for
Figs. 5 and 6 is 40°.

Fig. 4. Existence of double resonances in the acoustic wave
branch. The lower limit of I for resonance is shown
at I = I_1, ω = ω_1.
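
The angle convention just described can be made concrete with a small helper. This is a sketch
only: the function name `describe_direction` is hypothetical, and the quadrant numbering (first
quadrant upward and northward, fourth quadrant downward and northward, third quadrant downward
and southward) is an assumption drawn from the way quadrants are referred to in Section B.

```python
def describe_direction(theta_deg):
    """Classify a polar angle measured from the upward vertical.

    Positive angles are on the north-directed (+x) side, negative angles on
    the south-directed (-x) side. Angles lying exactly on an axis are
    assigned to a neighboring quadrant arbitrarily.
    Returns (quadrant, vertical_sense, horizontal_sense).
    """
    if not -180.0 <= theta_deg <= 180.0:
        raise ValueError("theta must lie between -180 and 180 degrees")
    vertical = "upward" if abs(theta_deg) < 90.0 else "downward"
    horizontal = "north" if theta_deg > 0.0 else "south"
    quadrant = {
        ("upward", "north"): 1,
        ("upward", "south"): 2,
        ("downward", "south"): 3,
        ("downward", "north"): 4,
    }[(vertical, horizontal)]
    return quadrant, vertical, horizontal


# Angles of the kind quoted for the resonant directions.
for theta in (162.60, -164.10, 10.21):
    print(theta, describe_direction(theta))
```

With this convention, an angle near +160° is a steep downward, northward direction and an angle
near -160° is a steep downward, southward direction.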

One  may  observe  in  each  figure  that  there  are  six  curves,  which  may  be  divided 
into  two  sets  of  three  curves  each,  that  are  placed  nearly  symmetrically  with  respect 
to  each  other  about  the  vertical  axis.  One  curve  of  each  set  is  a resonance  curve  having 
a theoretically  infinite  peak  reaching  the  edge  of  the  graph  in  the  figure.  The  resonance 
frequency  is  given  in  the  caption  for  each  figure.  The  other  two  curves  in  a set  are 
for  frequencies  that  are  50%  greater  and  50%  less  than  the  resonance  frequency.  Though 
infinite  responses  are  not  obtained  for  these  frequencies,  which  are  well  off  resonance, 
they  still  exhibit  large  peaks  (the  scale  is  logarithmic).  The  directions  of  their  peaks, 
however,  are  not  the  same  as  the  direction  for  resonance.  All  are  downward  pointing, 
but  the  direction  of  the  peak  for  the  50%  greater  frequency  is  the  furthest  from  the  verti- 
cal of  the  three  frequencies,  whereas  that  for  the  50%  lower  frequency  is  closest  to  the 
vertical. 

Peaked responses within each set are evident in Fig. 5 with a 15° spread of direction
for a 3:1 frequency range. The two sets, or the two resonant directions, combined
account for a 30° band of directions. The 3:1 frequency range was arbitrarily chosen,
and one may expect strong ionization response over a greater angular range for a larger
frequency band. Responses for the same altitude for larger dip angles yield much
wider angular spreads than the angular spread in Figure 5. Peaked responses within
each set as described for Fig. 5 are found to be as great as 50° about the resonant
direction for a dip angle of 80°. On the other hand, the angular spread in Fig. 6, which
is for a lower altitude, is much smaller. Though the dip angle is the same as that in
Fig. 5, the total spread in both quadrants for the 3:1 frequency range is about 10°.

[Fig. 5: polar plot; legend entries ω_c, 1/2 ω_c, and 3/2 ω_c.] ω_c is 1.79×10⁻³ rad/sec,
and its directions are θ_c = 162.60° for northward propagation and θ_c = -164.10° for
southward propagation.

Figures 5 and 6 show that near-resonant response is possible over a significant
altitude range for the same dip angle rather than just locally. Since the resonance
frequency of Fig. 5 at an altitude of 400 km is ω_c = 1.794×10⁻³ rad/sec, 1/2 ω_c, as used
in Fig. 5, is 8.97×10⁻⁴ rad/sec. The resonance frequency for Fig. 6, which is for an
altitude of 250 km, is ω_c = 2.47×10⁻⁴ rad/sec, and 3/2 ω_c is 3.7×10⁻⁴ rad/sec for
Figure 6. Strong peaking occurs at these frequencies, that is, at 1/2 ω_c at 400 km and
3/2 ω_c at 250 km. Though these frequencies differ by a little more than a factor of two,
it is quite apparent that for some difference in altitude that is less than the 150 km
difference between 250 km and 400 km, 1/2 ω_c at the higher altitude and 3/2 ω_c at the lower
altitude may indeed be equal. In fact, it can be shown that for a difference in altitude
equal to about 1.1 H, H being the neutral scale height, these two frequencies will be equal.
For the altitude range of interest, 1.1 H is some 90 to 100 km. For factors other than 1/2
and 3/2 this altitude range would be somewhat greater or smaller, accordingly. It is also
interesting to note that the direction for 1/2 ω_c at the higher altitude is closer to the
vertical than the direction for ω_c at that altitude and that the direction for 3/2 ω_c at
the lower altitude is further from the vertical than the direction for ω_c at the same
altitude. These directions, the one for 1/2 ω_c at the higher altitude and the one for
3/2 ω_c at the lower altitude, tend to overlap, as is necessary if the resonance is to be
characteristic of a sizable range of altitude.
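
The arithmetic behind these statements can be checked with a few lines. The sketch below is
illustrative only. It assumes that ω_c varies as exp(z/H), that is, inversely with a background
density that decays exponentially with a constant scale height H; this simplification is an
assumption made here and is not stated in that form in the text. The function name
`altitude_separation_in_scale_heights` is hypothetical.

```python
import math

OMEGA_C_400KM = 1.794e-3  # rad/sec, resonance frequency quoted for Fig. 5
OMEGA_C_250KM = 2.47e-4   # rad/sec, resonance frequency quoted for Fig. 6


def altitude_separation_in_scale_heights(low_factor=1.5, high_factor=0.5):
    """Altitude difference, in units of H, at which high_factor*omega_c at the
    upper level equals low_factor*omega_c at the lower level, ASSUMING that
    omega_c grows as exp(z/H)."""
    return math.log(low_factor / high_factor)


half_upper = 0.5 * OMEGA_C_400KM            # 1/2 omega_c at 400 km
three_halves_lower = 1.5 * OMEGA_C_250KM    # 3/2 omega_c at 250 km
ratio = half_upper / three_halves_lower     # "a little more than a factor of two"
dz_over_h = altitude_separation_in_scale_heights()  # ln 3, about 1.1

print(f"1/2 omega_c (400 km) = {half_upper:.3e} rad/sec")
print(f"3/2 omega_c (250 km) = {three_halves_lower:.3e} rad/sec")
print(f"ratio = {ratio:.2f}, separation = {dz_over_h:.2f} H")
```

Under the stated assumption the separation is ln 3 scale heights, which rounds to the factor of
about 1.1 quoted above; other pairs of factors change only the argument of the logarithm.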

Tables I and II show the effects of propagation out of the meridian plane, φ being
used to represent the angle out of the meridian plane. Propagation is assumed to be in
the northern hemisphere. In the tables, φ takes on the values 0° to 90°. Table I
compares resonance values for k_rc and the polar angle θ_c for the altitudes and dip angles
of Figures 5 and 6. The sign of θ_c determines whether propagation is in the northerly
or the southerly direction. The two wave vector directions are indicated by superscripts
(1) and (2). Table II gives calculations for a high altitude (600 km) at various dip angles.
This altitude is chosen because calculations for it exhibit the various changes that may
occur as dip angle is varied, as has been described in Section B.

It may be observed in Table I that the characteristics at an altitude of 250 km barely
change as φ increases from 0° to 90°. At an altitude of 400 km, some change in
k_rc(1) and k_rc(2) is evident as φ changes. The angles θ_c(1) and θ_c(2) hardly change. One
direction of propagation is in the northwest quadrant at both altitudes, whereas the other
is in the southeast quadrant. There is also symmetry about the north-south axis, and
so the same results are applicable in the northeast and southwest quadrants with their
corresponding values of φ.

TABLE I. Wave number k_rc and angle θ_c versus φ for two altitudes and for I = 40°.
(All k_rc entries are to be multiplied by 10⁻⁵; θ_c is in degrees.)

φ               0°       10°      20°      40°      60°      80°      90°

250 km, ω_c = 2.47×10⁻⁴ rad/sec
k_rc(1)       2.102    2.103    2.107    2.121    2.143    2.170    2.185
θ_c(1)       178.09   178.09   178.09   178.09   178.10   178.10   178.10
k_rc(2)       2.275    2.273    2.269    2.253    2.229    2.200    2.185
θ_c(2)      -178.11  -178.11  -178.11  -178.11  -178.11  -178.10  -178.10

400 km, ω_c = 1.79×10⁻³ rad/sec
k_rc(1)       1.223    1.229    1.245    1.309    1.419    1.576    1.672
θ_c(1)       162.60   162.61   162.66   162.84   163.09   163.36   163.50
k_rc(2)       2.524    2.505    2.450    2.260    2.018    1.779    1.672
θ_c(2)      -164.10  -164.10  -164.07  -163.98  -163.83  -163.62  -163.50


In Table II, one sees at I = 7° that θ_c barely changes value. Both θ_c(1) and θ_c(2)
correspond to steep third- and fourth-quadrant directions. Both values for k_rc, on the
other hand, change by factors between 2 and 3 as φ increases from 0° to 90°. At I = 10°,
one wave, as given by k_rc(1) and θ_c(1), hardly changes direction, but the wave number
increases by a factor of more than 2 as φ increases. The direction of propagation is in
the fourth quadrant. The other wave, for small φ, is in the first quadrant. However,
when φ = 20°, we find that it has turned steeply downward in the third quadrant. This
means that for some angle between φ = 10° and φ = 20° the slope of line L in Fig. 2 must
equal the asymptotic slope of the projected hyperbola in the meridian plane. For this
case there is only one resonance and it is in the fourth quadrant. As φ increases further,
the angle of propagation barely changes, though k_rc(2) decreases. For I = 20° the same
features as those for I = 10° are observed. However, θ_c(1) increases from about 114°
to 135°, the wave becoming steeper with increasing φ. The angle θ_c(2) lies in the first
quadrant out to at least φ = 60°. It changes to the third quadrant between φ = 60° and
φ = 80°. For I = 40° the frequency has increased to a value that must lie in the acoustic
branch of the acoustic-gravity wave dispersion relationship. There is no resonance
at some angle φ beyond 60°. Both k_rc(1) and k_rc(2) barely change with φ, and the directions
of propagation are in the first and fourth quadrant for φ less than some value between
20° and 40°. For φ > 40°, waves are resonant in the fourth quadrant.

TABLE II. Wave number k_rc and angle θ_c versus φ and I for an altitude of 600 km.
(For k_rc only the mantissas are listed; the powers of ten are not legible in the source.
θ_c is in degrees.)

φ               0°       10°      20°      40°      60°      80°      90°

I = 7°, ω_c = 5.16×10⁻⁴ rad/sec
k_rc(1)       6.884    6.944    7.127    7.887    9.269    1.151    1.310
θ_c(1)       173.68   173.70   173.76   173.96   174.23   174.49   174.51
k_rc(2)       4.641    4.473    4.039    2.941    2.070    1.505    1.310
θ_c(2)      -174.96  -174.96  -174.95  -174.91  -174.83  -174.69  -174.51

I = 10°, ω_c = 1.05×10⁻³ rad/sec
k_rc(1)       5.530    5.598    5.803    6.664    8.279    1.109    1.326
θ_c(1)       165.88   165.96   166.19   166.98   167.90   168.70   169.01
k_rc(2)       6.172    2.250    3.300    6.162    2.758    1.629    1.326
θ_c(2)        10.21    10.21  -169.79  -169.75  -169.61  -169.27  -169.01

I = 20°, ω_c = 4.06×10⁻³ rad/sec
k_rc(1)       4.461    4.525    4.723    5.637    7.674    1.255    2.110
θ_c(1)       113.67   114.45   116.60   123.08   129.43   133.91   135.49
k_rc(2)       9.443    9.757    1.078    1.672    5.861    3.327    2.110
θ_c(2)        48.13    47.83    47.04    44.94    43.56  -136.18  -135.49

I = 40°, ω_c = 1.43×10⁻² rad/sec
k_rc(1)       1.257    1.258    1.259    1.258    1.218    No Resonance
θ_c(1)        81.72    82.59    85.28    97.45   125.86
k_rc(2)       1.150    1.150    1.150    1.151    1.156    No Resonance
θ_c(2)       173.29   173.18   172.82   170.94   163.86

One may also inspect Table II for fixed φ as I increases. Most of the variations
as described by using the diagrams in Figs. 2 to 4 are observable, such as the change
of one of the resonant directions from third-quadrant angles to first-quadrant angles as
I increases for a fixed φ. It should be noted that between I = 20° and I = 40° the resonant
waves change from the gravity wave branch to the acoustic wave branch. For a range
of values of I between 20° and 40° and for φ < 60° the resonance must be cut off, since
the dip angle lies between the tangent cases exhibited in Figures 3 and 4. However,
this cutoff is exhibited only for φ = 80° and φ = 90° in the table, I = 40° lying in the
cutoff range for these values.

D.  Discussion 

The sections above have demonstrated the nature of the two resonant directions
that occur for a single frequency when a magnetic field is present and have shown how
the resonances change with dip angle I and the angle φ, the angle of propagation from
the meridian plane. The possibility of acoustic branch resonances is exhibited. It is
most important to note that the calculated frequencies and directions for resonance turn
out to be in the range of observed values for TID's as measured by incoherent scatter
techniques, other ground measurements, and satellite measurements (Refs. 5-16). These
results are considered significant and indicative of the possible connection of TID's and
the resonance phenomenon.

It  has  been  demonstrated  that  a strong  resonance  type  of  response  may  be  possible 
in  the  F region  at  a particular  frequency  from  a region  that  may  be  as  great  as  100km 
in  altitude.  If  this  premise  is  confirmed  by  a full  wave  analysis,  the  resonance  is  far 
less  localized  in  altitude  than  may  have  been  expected.  The  resonance  phenomenon  also 
produces  a strong  response  at  each  altitude  over  a relatively  large  frequency  range,  at 
least  as  much  as  3:1  about  the  resonance  frequency,  and  over  a sizable  range  of  directions 
that  may  be  at  least  as  large  as  10°  in  width  at  lower  altitudes  and  even  much  greater 
at  higher  altitudes. 

Experimental verification of the coupled waves is essential. However, since an
ionization acoustic wave excited by coupling has the same frequency and wave vector as
the acoustic-gravity wave perturbing the medium, it is difficult to distinguish the two
waves by present methods that have been used to detect TID's. Nevertheless, it is of
interest to analyze TID data on frequency and wave number statistically with respect to
values expected for coupled waves. The simultaneous measurement of TID characteristics
and atmospheric and ionospheric background parameters would aid considerably
in identifying the coupled wave. Unfortunately, such background data are not normally
available, so that special experimental efforts are necessary. Experiments that take
advantage of any of the characteristics of the ionization acoustic wave may also be
conceivable, but further study of such possibilities is needed.

More  theoretical  work  is  necessary,  particularly  a full  wave  analysis  to  confirm 
whether  the  suggested  strong  nonlocalized  response  is  real,  with  viscosity  and  thermal 
conductivity  losses  incorporated  as  well.  When  excitation  is  strong,  nonlinear  effects 
may  enter,  and  such  complications  deserve  treatment  as  well. 

National Aeronautics and Space Administration

REFERENCES

1. H. Eun and S.H. Gross, "Ionospheric Disturbances and Gravity Waves," J. Geophys.
Res., 81, 3261-3270 (1976).

2. C.O. Hines, "Electron Resonance in Ionospheric Waves," J. Atmos. Terr. Phys.,
9, 56-70 (1956).

3. C.O. Hines, "Internal Atmospheric Gravity Waves at Ionospheric Heights," Can. J.
Phys., 38, 1441-1480 (1960).

4. R.M. Clark, K.C. Yeh and C.H. Liu, "Interaction of Internal Gravity Waves with
the Ionospheric F2-layer," J. Atmos. Terr. Phys., 33, 1567-1576 (1971).

5. J. Testud and P. Francois, "Importance of Diffusion Processes in the Interaction
Between Neutral Waves and Ionization," J. Atmos. Terr. Phys., 33, 765-774 (1971).

6. C.O. Hines, "The Upper Atmosphere in Motion," Geophys. Monogr. Ser., 18,
American Geophysical Union, Washington, D.C. (1974).

7. S.H. Gross and H. Eun, "Traveling Neutral Disturbances," Geophys. Res. Lett.,
3, 257-260 (1976).

8. T.M. Georges, "HF Doppler Studies of Traveling Ionospheric Disturbances," J.
Atmos. Terr. Phys., 30, 735-746 (1968).

9. K. Davies and J.E. Jones, "Three-dimensional Observations of Traveling Ionospheric
Disturbances," J. Atmos. Terr. Phys., 33, 39-46 (1971).

10. P.L. Dyson, G.P. Newton and L.H. Brace, "In Situ Measurements of Neutral and
Electron Density Wave Structure from the Explorer 32 Satellite," J. Geophys. Res.,
75, 3200-3210 (1970).

11. J.P. Friedman, "Propagation of Internal Waves in a Thermally Stratified
Atmosphere," J. Geophys. Res., 71, 1033-1054 (1966).

12. C.S.G.K. Setty, A.B. Gupta and O.P. Nagpal, "Ionospheric Response to Internal
Gravity Waves Observed at Delhi," J. Atmos. Terr. Phys., 35, 1351-1361 (1973).

13. J. Testud, "Gravity Waves Generated During Magnetic Substorms," J. Atmos. Terr.
Phys., 32, 1793-1805 (1970).

14. G. Thome, "Long-period Waves Generated in the Polar Ionosphere During the Onset
of Magnetic Storms," J. Geophys. Res., 73, 6319-6336 (1968).

15. G. Vasseur and P. Waldteufel, "Thomson Scatter Observations of a Gravity Wave
in the Ionospheric F-region," J. Atmos. Terr. Phys., 31, 885-888 (1969).

16. K.C. Yeh and C.H. Liu, "Acoustic Gravity Waves in the Upper Atmosphere," Rev.
Geophys. Space Phys., 12, 193-216 (1974).


WAVE-MATTER  INTERACTIONS 




A KELVIN-HELMHOLTZ TYPE INSTABILITY IN PLASMA STRIATIONS

H.  Kudyan  and  K.  Chung 

It  was  reported  previously  that  plasma  striations  are  closely  associated  with  low- 
frequency  weak  plasma  turbulence  and  strong  ion  beams  along  the  confining  magnetic 
field.^  Recognizing  the  role  of  ion  beams  in  splitting  plasma  sheets  and  inducing  distinct 
spectral  peaks,  we  have  further  carried  out  experimental  investigations  on  the  weak 
plasma  turbulence  caused  by  an  ion-beam  induced  instability. 

Although the cross-sectional profile of a magnetically confined electron cyclotron
resonance plasma may take a large number of different apparent structures, there exist
fundamentally two types of profiles: the hollow (cylindrical) column and the fully
developed column. Hollow column plasma is highly susceptible to breaking into striations
and usually exhibits a plasma instability which develops into weak plasma turbulence as
striations form. The striated plasma column may actually be regarded as the fully
developed, weakly turbulent state of hollow column plasma. On the verge of breaking up,
hollow column plasma exhibits very strong oscillations caused by a plasma instability.

Using a phase-sensitive lock-in amplifier loop for measuring the wave pattern in
the plasma and employing concentric electrodes to control the axial ion beams, we have
observed the frequency variation of the instability and the growth-decay behavior of the
instability in helium, nitrogen, air and argon discharges. In addition, we have measured
the distribution functions using small electrostatic analyzers and the plasma potential
by electrostatic probes.

Until the instability develops into weak turbulence, we observe standing wave
behavior in both the radial and axial directions. Such standing wave patterns deteriorate
as the plasma becomes weakly turbulent and breaks into striations. Whereas the
oscillation frequency is roughly proportional to the radial gradient of the background
potential, it is virtually independent of the mass of the ions and the magnitude of the
confining magnetic field. In addition, it is not possible to match the observed frequency
with either the ion acoustic wave frequency or the ion cyclotron frequency for the ranges of
parameters of our plasma. The frequency range, 20 to 90 kHz, strongly indicates a
low-frequency electrostatic instability as the cause of the oscillations.

In view of the strong ion beams in the unstable plasma, we have made a detailed
measurement of the flow pattern of ion beams. In Fig. 1, we show a typical measurement
of the radial potential distribution, the relative flow distribution and the energy
analyzer data in a striated plasma column. (The picture is taken axially to show the
plasma striations.) As suspected, the plasma is embedded with flow shears in both the
azimuthal and axial directions. When we apply a potential (via concentric grids or
electrodes), we are able to ignite the splitting of the plasma column, or to subdue the
plasma oscillation by reversing the polarity of the applied potential. It is to be noted that
the application of a radial potential modifies both the axial and radial flow patterns. This
is due to the fact that the conservative (eE, ∇p) and non-conservative forces
must balance each other in an equilibrium plasma. This situation induces shear fields
in the plasma column, which have pronounced effects on a Kelvin-Helmholtz type
instability. The oscilloscope display shows the quenching and growing of the
Kelvin-Helmholtz type instability observed in the striated plasma (see Figure 2).

The electrostatic Kelvin-Helmholtz type instability was previously observed in
Q-machines, in which shear flow fields were created either by imposed mechanical
walls or by plasma rotations. Our observation implies, however, that a similar
Kelvin-Helmholtz type instability is capable of splitting plasma sheets into striations
as the instability grows into weak plasma turbulence. This is a rather serious
implication, since plasma striations were observed in plasma clouds created in a flowing
neutral background and their charge sheets were subject to splitting when conditions
deviated from uniformity. One must be aware of the possibility of the formation of striations
when the plasma is subject to strong flow fields and shears of the flow fields develop.


THERMAL ANALYSIS OF VACUUM VESSEL BAKE-OUT OF FUSION TEST REACTOR
K. Chung

The torus vacuum vessel of a Tokamak-type fusion reactor should be prepared
for very high vacuum operation. Although the individual components are prebaked
prior to assembly, it is normally required to perform an in-situ bake-out of the
vacuum vessel in order to remove impurities from the inner surfaces.

There exist several conditions and constraints for the in-situ bake-out process:

(1) The vacuum vessel must be heated to a temperature above 250°C (482°F).
Depending on the vacuum performance of the vessel, the required bake-out
temperature can be as high as 500°C.

(2) Because of mechanical stress considerations, the temperature differences in
the vacuum vessel should be kept below 50°C during both the steady state
and the transient state.

(3) In view of the normal operation requirements, the heat-up time should be
limited to approximately four hours, and the steady-state bake-out time
may be as long as several days.

(4) The bake-out process must be repeatable under remote maintenance conditions,
since the vacuum vessel will become radioactive after D-T operation.

(5) The hardware required for the in-situ bake-out should be compatible with the
other structures of the test reactor, and the bake-out system should not impair
the performance of the device.

There exist several alternatives for bake-out heating:

(1) Hot gas heating: the vacuum vessel is heated by flowing hot gas (helium,
argon or air) through heat pipes surrounding the vacuum vessel. The inner
surfaces of the vessel are heated through heat transfer from the heat pipes.

(2) Hot liquid heating: to increase the heat capacity (and thus reduce the flow
rate and temperature gradient along the heat pipe), hot organic fluid can be
used as the medium. Again the heat transfer rates determine the temperature
profiles of the vacuum vessel.

(3) Heat blanket: as in normal bake-out processes, the vessel is surrounded by
heat blankets. Heating is done from the outer shell, and the inner shell and
structures are heated via conduction and radiation.

(4) Electrical resistive heating: by passing electrical currents (ac or dc), the vacuum
vessel is heated to the bake-out temperature. The electrical current is supplied
by external power sources via terminals attached to the vacuum
vessel.

(5) Inductive heating: change of the toroidal magnetic field induces poloidal
currents in the vacuum vessel. Ohmic heating due to the induced poloidal
currents can be used for heating the vacuum vessel for bake-out.

(6) Plasma discharge heating: a large amount of energy can be deposited on the
inner walls of the vacuum vessel by the radiation and particle loading from the
discharges. Discharge cleaning is itself an effective method for vacuum
preparation of the inner walls of the vessel. It can also be used to heat the vacuum
vessel for the bake-out process.

Assessing these alternatives against the desired conditions and constraints and
further comparing them with respect to spatial requirements, structural simplicity and
economics, hot gas heating and electrical resistive heating are chosen as the primary
candidates.

Inductive heating is considered as the supplementary system. Hot gas heating is
attractive since it can be combined with the vessel cooling system. For heating bellows
and internal structures directly and in a programmed way, electrical resistive
heating provides the most flexible scheme. Since inductive heating can be applied
without any additional equipment, it is available for application. Thus, for the in-situ
bake-out process of the vacuum vessel, hot gas heating and/or electrical resistive
heating are the acceptable solutions, with inductive heating as a backup operation.

The design of the sizes and layout of the heat sources depends on the thermal
characteristics of the vacuum vessel. Because of complicated shapes and boundary
conditions, the thermal characteristics of the vacuum vessel can be determined only by
the numerical solution of the time-dependent heat conduction (radiation, convection)
equation, which includes heat sources externally conducted or internally generated.
Several computer codes exist for this computation; we have chosen the LION program
developed by Knolls Atomic Power Laboratory for CDC computers. In addition to
solving heat conduction in structural elements, LION may also be used to solve
problems involving forced convection, free convection, or radiation, to find temperatures
and heat fluxes on the surface. The LION program uses a first forward-difference
method based on a nodal representation of the geometry, through the evaluation of the
equivalent resistances of the nodal connectors.
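The nodal forward-difference idea can be illustrated with a small sketch. This is not the LION code: the function names, the dense conductance matrix and the lumped node capacities are all assumptions made for illustration, and any consistent set of units may be used. Each node carries a heat capacity, each connector an equivalent resistance R_ij (stored here as a conductance 1/R_ij), and one explicit step updates every node from the net heat flowing into it.

```python
import numpy as np


def forward_difference_step(T, C, G, q, dt):
    """Advance the nodal temperatures by one explicit (forward-difference) step.

    T  : (n,) node temperatures
    C  : (n,) node heat capacities (energy per degree)
    G  : (n, n) symmetric connector conductances, G[i, j] = 1 / R_ij,
         zero where two nodes are not connected
    q  : (n,) internal heat generation at each node (energy per unit time)
    dt : time step
    """
    # Net heat flow into node i: sum_j G_ij * (T_j - T_i) + q_i
    net_flow = G @ T - G.sum(axis=1) * T + q
    return T + dt * net_flow / C


def max_stable_dt(C, G):
    """Largest step for which the explicit update does not oscillate.

    Assumes every node has at least one connector.
    """
    return float(np.min(C / G.sum(axis=1)))


# Hypothetical three-node chain: node 0 is heated, nodes 1 and 2 are passive.
G_chain = np.array([[0.0, 1.0, 0.0],
                    [1.0, 0.0, 1.0],
                    [0.0, 1.0, 0.0]])
C_chain = np.ones(3)
q_chain = np.array([1.0, 0.0, 0.0])
T_chain = np.zeros(3)
dt_chain = 0.5 * max_stable_dt(C_chain, G_chain)
for _ in range(100):
    T_chain = forward_difference_step(T_chain, C_chain, G_chain, q_chain, dt_chain)
```

Because the connectors only exchange heat between nodes, the total stored energy sum(C * T) grows by exactly dt * sum(q) per step, which gives a simple sanity check on any nodal model of this kind.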

The TFTR vacuum vessel is made of 305 Stainless Steel, whose thermal conductivity,
specific heat and electrical resistivity are 10.2 Btu/(hr·ft²·°F/ft) (at 392°F),
0.12 Btu/lb/°F and 72 μΩ-cm, respectively. The physical shape and thermal analysis
model are described in Figure 1. The vacuum vessel is composed of sectionized
solid-wall and bellows combinations which are connected along the major axis of the
torus. The bellows, whose function is to increase the toroidal resistance of the vacuum
vessel, are also made of thin stainless steel sheets and are protected on the inside
against the hot plasma and on the outside against pressure. The thickness and detailed
structure of the vacuum vessel are determined from other design requirements such as
stress, conductivity, maintenance, etc. Once the configuration is given, we may
determine the required level of heat generation for the bake-out heating using the
LION code. An example of the computation is given in Figure 2, which shows the
time-dependent temperature profile of the vacuum vessel for electrical resistance
heating of 5.0 Btu/hr-in³ in the solid walls and 0.5 Btu/hr-in³ in the bellows. The
result is acceptable. This particular heat generation rate can be achieved by a
toroidal electric current of ~0.8 kA and a poloidal electric current (in the solid
wall) of ~10 kA. Such electric currents can easily be supplied by external power
sources.

Fig. 2. Bake-out resistive heating.
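As a rough cross-check of these figures, the volumetric Joule heating q = ρJ² can be inverted for the current density J. The sketch below assumes a resistivity of 72 micro-ohm-cm (a typical value for austenitic stainless steel) and uses standard unit conversions; the function name is hypothetical. The conducting cross sections of the walls and bellows are not given in the text, so only current densities are computed, not total currents.

```python
import math

BTU_PER_HR_TO_W = 0.29307107   # 1 Btu/hr expressed in watts
IN3_TO_CM3 = 16.387064         # 1 cubic inch expressed in cubic centimetres
RESISTIVITY_OHM_CM = 72e-6     # assumed: 72 micro-ohm-cm


def current_density_a_per_cm2(q_btu_hr_in3, resistivity_ohm_cm=RESISTIVITY_OHM_CM):
    """Current density (A/cm^2) that produces the Joule heating q = rho * J**2."""
    q_w_per_cm3 = q_btu_hr_in3 * BTU_PER_HR_TO_W / IN3_TO_CM3
    return math.sqrt(q_w_per_cm3 / resistivity_ohm_cm)


j_solid = current_density_a_per_cm2(5.0)    # solid walls, about 35 A/cm^2
j_bellows = current_density_a_per_cm2(0.5)  # bellows, about 11 A/cm^2
```

The two densities differ by a factor of sqrt(10), since the heating rate scales with the square of the current density.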


The final design of the bake-out heating system depends on more than the thermal
analysis of the vacuum vessel alone. It must incorporate other design considerations
such as the cooling system during discharge cleaning, the need for structural
reinforcement, protection against disruptive discharges, etc.

NEW CONCEPTS IN LINEAR PROPULSION
E. Levi

A novel type of electric drive for ground transportation is now undergoing
intensive development. This scheme divides the drive motor into two parts, of which one
is carried by the vehicle and the other lies straight along the track. The force of
interaction between these two structures is utilized directly as tractive effort, without
the need for intermediate transmission or gear.

The major appeal of this type of linear electric propulsion is that it promises to
overcome the limitations of the wheel in high-speed rail transportation. For
high-speed applications, linear versions of both induction and synchronous motors of
conventional design are being studied (Refs. 1-5).

In low-speed applications, however, linear motors must compete with their
rotating counterparts and are handicapped by the lack of thrust-multiplying gears. Hence,
to overcome this limitation, some means must be devised to enhance the thrust
developed per unit weight.

A.  An  Unusual  Linear  Motor 

A new invention, the "variable-reluctance poly-phase homopolar converter,"
accomplishes the function of the gear by magnetic means. The motor consists of a
passive ferromagnetic rail track and of an energized structure carrying the phase windings
of a poly-phase system on separate magnetic cores located on board the vehicle.

Figure  1 is  a photograph  of  a small-scale,  but  fully  operational,  model  of  the 
motor,  while  Fig.  2 is  a perspective  view  of  one  phase  winding,  its  magnetic  core,  and 
the  corresponding  section  of  the  passive  rail  track. 

The thrust develops as a result of the tendency of the teeth in the energized
structure to align themselves with those of the rail track. The phase cores are solidly
mounted together and spaced from one another by a distance corresponding to the phase
angle of the current they carry, so that their combined effect is a traveling wave.

The  main  features  of  the  motor  are: 

(1)  The  thrust  is  proportional  to  the  number  of  teeth  and  can  be  made  practically 
independent  of  the  weight  of  the  motor 

(2)  A single  winding  carries  both  field  and  armature  excitation  resulting  in  high 
efficiency  and  low  weight  of  the  part  of  the  motor  carried  by  the  vehicle 

(3)  The  motor  develops  significant  forces  of  attraction  between  the  energized 
structure  and  the  passive  rail  track  --  forces  which  can  be  utilized  for 
magnetic  levitation 

(4) The low-profile rail track shown in Fig. 1 makes this propulsion system
compatible with existing rail systems

Fig. 1. Model of novel propulsion system for light-rail transit.

Fig. 2. Variable reluctance polyphase homopolar converter, showing the repetitive
phase arrangement. NOTE: Proposed design does not use raised teeth on rail
(i.e., in-ground strip).

Since high thrust-to-weight ratios are easily attainable, this motor is
particularly suited to relatively low-speed, high-tractive-effort applications, such as rapid
transit and light-rail transit.

B. A Collectorless Electric Bus

Another invention, the collectorless electric bus, is aimed at overcoming the
serious drawbacks of buses propelled by internal combustion engines, i.e., (a) low
efficiency, (b) consumption of critical energy resources, (c) noise, and (d) chemical
pollution.

The  propulsion  system  consists  of  a low-voltage  energized  track  which  is  emplaced 
in  the  road  pavement,  flush  with  its  surface,  and  a passive  electromagnetic  structure 
carried  by  the  vehicle.  Flexible  mounting  of  this  electromagnetic  structure  on  board 
a rubber-tired  vehicle  would  allow  its  lateral  displacement  up  to  a full  bus  width  from 
the  track  (see  Figures  3 and  4).  As  a result  the  bus  can  maneuver  in  traffic  and  pull 
to  a curb. 


Fig. 3. Collectorless electric bus with linear electric motor. NOTE: Cut-away
shows motor secondary mounted on a below-vehicle runway for lateral mobility.


It  should  be  noted  that  there  exists  no  physical  contact  between  the  power  source 
and  the  vehicle.  The  power  transfer  is  via  induction.  The  gap  separating  the  vehicle 
and  roadbed  parts  of  the  motor  is  approximately  half  an  inch  (see  Figure  4).  The  forces 
of  attraction  between  the  primary  and  secondary  structure  are  utilized  to  maintain  the 
necessary  alignment  and  guide  the  movable  part  along  the  fixed  one. 

The  structurally  simple  and  sturdy  primary  consists  of  straight  conductors  laid 
parallel  along  the  direction  of  motion,  and  intertwined  with  laminated  iron  cores.  The 
view from below in a small-scale model is shown in Figure 5.

Fig. 4. Two designs for mounting of propulsion system on collectorless electric bus:
(a) Design 1: electric motor secondary sliding on runway; (b) Design 2: electric
motor secondary with swinging arm, arm sliding through rotating fixed head.
Labels: in-ground energized track, motor secondary. NOTE: Permitted degrees of
motion shown by arrows.

Fig. 5. Aluminum conductors and ferromagnetic cores viewed from underside.

Preliminary calculations have been made for a typical vehicle with a passenger
capacity of 53 seats and 40 standees. Two-inch diameter aluminum conductors will
suffice to provide an acceleration rate of 2.7 mphps (1.2 m/sec²) and speeds well in
excess of the practical limit of 50 mph.

The secondary circuit includes a thyristor rectifier-inverter and a battery. A
battery on board the vehicle would be required in any case to provide auxiliary power.
Insertion of the battery in the secondary circuit endows the induction motor with the
properties of a doubly excited machine. It thus becomes possible to control the speed,
the acceleration and deceleration, and even the direction of energy flow, exclusively
from the vehicle side. Moreover, forward and backward motion, as well as braking,
can be achieved, even with loss of primary power, by operating the motor as a variable
reluctance machine. The charge-discharge cycle of the battery accomplishes a leveling
of the peak in demand. This is an important consideration with regard to energy
conservation.

When compared with other alternatives which have been considered for the
existing urban buses, the collectorless electric bus offers the following advantages:

(1)  It  can  maneuver  up  to  about  eight  feet  laterally,  a feat  which  tracked  vehicles, 
such  as  trolley  cars  cannot  perform 

(2)  It  needs  no  unsightly  overhead  wires  from  which  trolley-buses  inconveniently 
become  un-tracked 

(3)  It  requires  no  technological  break-throughs,  as  is  the  case  with  electric 
battery  and  kinetic  energy  buses. 

Moreover, the propulsion system for the collectorless electric bus, as well as
the variable-reluctance poly-phase homopolar converter described above, may permit
the realization of many of the advantages of linear motors over their rotating
counterparts. These are (Ref. 4):

(1) Increased adhesion force with consequent feasibility of faster acceleration
and deceleration

(2) Improved ride dynamics and comfort (of benefit to elderly/handicapped riders)

(3) Regenerative braking and, in general, higher efficiency

(4) Greater safety resulting from the presence of an additional electric braking
system which does not rely on the wheel or rail

(5) Noise abatement, as a result of the absence of gears, reduced flat wheels
(no tread braking or locked wheel sliding) and reduction in rail wear and tear

(6) Lower maintenance costs of propulsion and braking systems and tracks

(7) Lower pollution from brake shoes (cast iron, asbestos).

MAGNETOSTATIC POTENTIAL AND MAGNETIC FLUX DENSITY ALONG THE
ARMATURE OF HOMOPOLAR INDUCTOR MACHINES

E. Levi and M.S. Gemelos


A new method for the determination of static field distributions in complex
geometries and with arbitrary field distributions was developed in previously reported
studies. This method was first applied to homopolar inductor type machines, under
the assumption that the armature surface is smooth and that the armature excitation
consists of a sinusoidal current sheet. In reality the armature conductors are placed
in discrete slots, and the examination of iron saturation effects requires a more detailed
description of the physical structure.

The  aim  of  this  study  is  the  determination  of  the  magnetic  flux  density  and  of  the 
magnetostatic  potential  along  a plane  lying  on  the  armature  tooth  tips.  The  problem  is 
assumed  to  be  two-dimensional,  the  current  carrying  conductors  are  removed  and  the 
slots  assumed  to  have  infinite  depth,  so  that  conformal  mapping  procedures  and  in 
particular  the  Schwarz-Christoffel  transformation  can  be  used.  The  armature  teeth 
are  assigned  appropriate  potentials  with  respect  to  the  pole  face  located  on  the  other 
side  of  the  air  gap. 

To gain more insight, a step-by-step approach was adopted. First a simple
configuration is examined containing an isolated slot. Then, more involved schemes, such
as a succession of open slots, are studied. Taking advantage of the fact that linearity
holds in the air gap, the field and armature excitation are also examined separately.

The simplest case is the classical Carter problem of an isolated slot with field
excitation only. The complicating factor is that interest is focused here on the flux
distribution at the surface of the slotted armature, rather than at the smooth surface
across the gap. It is found that the perturbation caused by the presence of the slot dies
out a short distance away along the gap.

The  case  of  adjacent  slots,  also  with  field  excitation,  yields  an  exact  solution  in 
terms  of  the  Jacobian  elliptic  functions.  The  flux  density  and  the  potential  along  the 
armature  surface  are  given  in  Figure  1.  A remarkable  result  is  that  in  the  case  studied 
in  which  the  ratio  of  gap  length  over  the  slot  width  is  two,  the  field  under  the  slot  is  as 
strong  as  the  field  under  the  tooth  face.  This  is  due  to  the  large  ratio  of  gap  length 
over  slot  width.  When  the  slot  is  very  deep,  changes  in  the  gap  length  cannot  affect 
significantly  the  field  strength  under  the  slot,  while  the  field  under  the  tooth  is  strongly 
affected  by  these  changes. 

With armature excitation, and in contrast with the case of field excitation, it is
found that the perturbation caused by an isolated slot dies away slowly along the gap.
The interaction between adjacent slots then becomes important.

Fig. 1. Field excitation.

With a succession of teeth at different potentials, a rigorous solution becomes too
cumbersome because of the lack of symmetry. However, since the field is required
only along the armature surface and not within the gap, and since the armature excitation
results in a discrete stepwise increase of the magnetostatic potential from one tooth to
the adjacent one, a solution can be obtained by treating only a couple of slots and by
using superposition.


Plots of the potential and B-field along the armature surface for the elementary
two-slot configuration are shown in Figure 2. It is found that the perturbation caused
by the slot dies out faster than in the case of an isolated slot. This results in a more
uniform field under the tooth face, albeit at a lower value than the asymptotic one.

Finally, the typical case of a succession of three slots belonging to the same
phase winding is shown in Figure 3. The potential and B-field distributions are obtained
by superimposing the plots of Fig. 2, multiplied by the factors 2, 1 and -1, over the
regions of teeth at potentials $2\phi_0$, $\phi_0$ and $-\phi_0$, respectively. The pole
surface is taken at zero potential.


These results are in good agreement with plots obtained by analogue simulation
using a Pasco Scientific Equipotential and Field Gradient Mapper.

Fig. 2. Armature excitation (for two slots).

Fig. 3. Armature excitation (for a succession of slots).

In further studies iron saturation effects will be taken into account by modifying
the magnetostatic potentials in an iterative procedure.

NEW RESULTS IN ELECTRICAL MACHINE THEORY: FORMULAS OF THE BLASIUS
TYPE FOR THE CALCULATION OF FORCE AND TORQUE ON THE ROTOR OF A
CYLINDRICALLY UNIFORM ELECTROMAGNETIC MACHINE VIEWED AS AN
AEROFOIL IMMERSED IN A MAGNETIC FLUID

D. C. Youla

Consider a magnetized current-carrying ferromagnetic body separated from all
other sources by a free-space region. As is well known, the expressions

$$\mathbf{F} = \frac{\mu_0}{2} \oint_{S'} H^2\, \mathbf{t}\, da \tag{1}$$

and

$$\mathbf{T} = \frac{\mu_0}{2} \oint_{S'} H^2\, \mathbf{r} \times \mathbf{t}\, da \tag{2}$$

give the correct values for the total external magnetostatic force and torque exerted on
the complete body, provided the orientation of the unit vector $\mathbf{t}$ with respect to the
magnetic intensity $\mathbf{H}$ and the positive normal $\mathbf{n}$ erected at $da$ is as shown in Figure 1.
($S'$ is any closed surface in the source-free region which encloses the body, $H = |\mathbf{H}|$, and
$\mathbf{r}$ is the radius vector of the element of area $da$ from the origin of coordinates. The
surface $S'$ can be chosen coincident with $S$ if it is understood that $\mathbf{H}$ is the magnetic
intensity just above $da$ and outside of $V$.)

Let us now assume that the problem is 2-dimensional and the body possesses
uniform cross-section in the z-direction. Then $\mathbf{H} = \mathbf{H}(x,y)$ and $\mathbf{k} \cdot \mathbf{H} = 0$. In the
source-free region $\nabla \times \mathbf{H} = 0$ and there exists a scalar function $u(x,y)$ such that

$$\mathbf{H} = -\nabla u(x,y). \tag{3}$$

Since $\nabla \cdot \mathbf{H} = 0$,

$$\frac{\partial H_x}{\partial x} + \frac{\partial H_y}{\partial y} = 0 \tag{4}$$

and

$$\nabla^2 u(x,y) = 0. \tag{5}$$

Thus $u(x,y)$ is harmonic (but not necessarily single-valued). Along a streamline of $\mathbf{H}$,

$$H_y\,dx - H_x\,dy = 0. \tag{6}$$

However, as a consequence of Eq. (4),

$$H_y\,dx - H_x\,dy = \text{exact differential} = dv(x,y) = 0, \tag{7}$$

$v(x,y)$ a scalar function. Hence, the conjugate function $v(x,y)$ is constant along a
streamline of $\mathbf{H}$ and

$$\frac{\partial v}{\partial x} = H_y = -\frac{\partial u}{\partial y}, \qquad
\frac{\partial v}{\partial y} = -H_x = \frac{\partial u}{\partial x}. \tag{8}$$

But these are precisely the Cauchy-Riemann equations which connect the real and
imaginary parts of the analytic function $W = u + jv$ of the complex variable $z = x + jy$. Since
$W(z)$ is analytic,

$$\frac{dW}{dz} = \frac{\partial u}{\partial x} + j\frac{\partial v}{\partial x}
= \frac{\partial u}{\partial x} - j\frac{\partial u}{\partial y} \tag{9}$$

$$= -(H_x - jH_y) = -\overline{\underline{H}}, \tag{10}$$

Fig. 1

where

$$\underline{H} = H_x + jH_y$$

is the complex-plane representation of $\mathbf{H}$. (This z should not be confused with
the space variable. As usual, $\bar{a}$ denotes the complex-conjugate of $a$.) Under the
substitutions
r -»  z , 
dr  -»  dz  , 

tda  ->  e'^^dz  , 

rxtda  ->  Imag(z  e"*  dz)  k , 
n da  -»  jdz  , 

it  is  evident  that  for  unit  axial  length 


F -»  F = 


H H 

^C' 


•%jY^z 


and 


k • T T = -Imag  ^ H H 
cp  K 2 \ 


(11) 


(12) 


(13) 


(14) 


($C'$ is the boundary of the xy-projection of the 3-dimensional surface $S'$.) From the
geometry of Fig. 1,

e'^'^H  dz  = -jH(dz) 
whence, 

and 

T = Re  ^ (^)^zdz  . (17) 

CO  2 / dz  ' ' 


(15) 

(16) 


Consequently, all force and torque calculations in cylindrically uniform ferromagnetic
systems can be carried out with the aid of the same Blasius formulas used so
effectively in the theory of aerofoils (Ref. 3). Furthermore, and this is very significant
from a practical point of view, Eqs. (16) and (17) are valid even for saturated and
non-infinitely-permeable iron. It should now be possible to attack the problem of
designing optimal rotor profiles with the aid of the theory of residues. Any new results
in this direction will be reported at the appropriate time.

(Re a and Imag a stand for the real and imaginary parts of a, respectively.)
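A Blasius-type evaluation can be checked numerically on a case whose answer is known in advance. The sketch below assumes a counterclockwise contour and writes the force as F_x - jF_y = -j(μ0/2)∮(dW/dz)² dz and the torque as T = (μ0/2) Re ∮(dW/dz)² z dz; these sign and orientation conventions are chosen here for the example and may differ from those fixed by Fig. 1. The test case, a line current I at the origin in a uniform field H0 along x, and all function names are illustrative. The elementary result is a force μ0·I·H0 per unit length along +y and zero torque about the origin.

```python
import numpy as np

MU0 = 4e-7 * np.pi  # permeability of free space, H/m


def blasius_force_torque(dW_dz, radius=1.0, n=4096):
    """Force (as F_x + j F_y) and torque per unit axial length.

    The contour is the circle |z| = radius, traversed counterclockwise and
    discretized with n equally spaced points (trapezoidal rule).
    """
    phi = 2.0 * np.pi * np.arange(n) / n
    z = radius * np.exp(1j * phi)
    dz = 1j * z * (2.0 * np.pi / n)
    g = dW_dz(z) ** 2
    force_conj = -1j * (MU0 / 2.0) * np.sum(g * dz)     # F_x - j F_y
    torque = (MU0 / 2.0) * np.real(np.sum(g * z * dz))
    return np.conj(force_conj), torque


H0 = 1000.0  # uniform field along x, A/m (illustrative value)
I = 50.0     # line current along +z at the origin, A (illustrative value)


def dW_dz_wire(z):
    # dW/dz = -conj(H): the uniform field contributes H0, the wire -j*I/(2*pi*z)
    return -(H0 - 1j * I / (2.0 * np.pi * z))


force, torque = blasius_force_torque(dW_dz_wire)
# force.imag should match MU0 * I * H0; force.real and torque should vanish
```

Only the 1/z term of (dW/dz)² contributes to the force integral, which is the residue calculation that the closing remark on optimal rotor profiles alludes to.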
STATIC TESTS ON THE CLAW POLE LINEAR SYNCHRONOUS MOTOR

E. Levi, L. Birenbaum, P. Chan, D. Fontana, F. Lalezari, J. Mullen, N. Rokkos,
D. Wong

Static tests on the Nadyne motor are possible because the motor is of the
synchronous type; static tests on a linear induction motor, by contrast, are not
possible because the developed forces are due to induced currents. An idealized
sketch of the motor and a typical magnetic flux-path are shown in Figure 1.


Fig. 1. Idealized sketch of claw-pole linear synchronous motor, showing the field
yoke, armature winding and passive rail track.


The field winding (on the vehicle section) is excited with direct current, which
produces alternate north and south magnetic field poles in the part of the track
underneath the vehicle. In order to perform static tests on the motor, the armature coils
are also excited with direct current. During normal operation, polyphase AC would be
used. Because of the way the armature coils are wound, a suitable choice of DC
currents to feed the separate phases gives rise to a static magnetic field varying
(approximately) sinusoidally along the armature. Figure 2 is a schematic diagram showing
the vehicle core used for the model, including important dimensions. Figure 3 is a
photograph of the salient pole track structure. The "claw" type interdigital arrangement of
the poles is clearly visible. The interaction of the fields created by currents in the field
winding and in the armature winding results in a horizontal thrust and a vertical
levitation force.


The  main  purpose  of  the  tests  now  to  be  described  was  to  measure  the  magnetic 
fields  in  the  air  gap  of  the  motor,  and  to  measure  the  corresponding  thrust  and  lift  forces 
caused  by  their  interaction. 

A.  Details  of  the  Force  Measurements 


The magnetic fields were measured using a Bell gaussmeter. In order to measure
the horizontal and vertical forces, the motor was suspended in an aluminum retaining
frame (Fig. 4) and the forces exerted on the track (rather than on the motor) were
measured using horizontal and vertical load transducers.

Fig. 4. View of measuring jig for Nadyne static model. The motor (M) is positioned
under the track (T). Both are housed within an aluminum frame.


In an actual dynamic model of the motor, the armature winding would be energized
with three-phase alternating currents. In order to perform the static tests, the
armature windings were instead energized with DC currents, to simulate an instantaneous
snapshot of conditions in the real motor. For the measurement, three different
configurations were used, representing three instants of operation, and intended to
simulate three different mmf distributions along the vehicle. These three conditions
were chosen 120° apart in time. One of these is shown in Figure 5.

The magnetic field and force measurements were done as follows: first, with the
field winding alone activated; then, with the armature winding alone activated; finally,
with both circuits activated. These measurements made it possible to examine the
linearity of the system (i.e., whether the B fields produced by the field and armature
windings separately were really additive), and to identify clearly the contributions of
each winding to the net field and force in each case. It was important to know whether
linearity was present if a meaningful comparison of the measurement with theory was
to be made. Results of a typical B field measurement are shown in Fig. 6, in which
near-linearity is apparent.

Fig. 5. Sketch illustrating rationale for DC excitation of armature:
(a) vehicle and rail structure; (b) currents in armature
winding; (c) current distribution along armature; (d) mmf
distribution along armature.


All force and field measurements were done using 2 A DC for the field winding and
1.5 A DC for the maximum current in the armature winding. This represents a
sinusoidal armature current of 1.5/√2 A r.m.s.
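The DC "snapshot" excitation can be illustrated by sampling a balanced three-phase set at fixed instants. The sketch below assumes cosine-referenced currents with phases displaced by 120 electrical degrees and a peak of 1.5 A. The phase ordering, the time reference and the function name are assumptions for illustration; the configurations actually used are those of Fig. 5.

```python
import math

PEAK_A = 1.5  # maximum armature current, A DC


def phase_currents(peak, electrical_angle):
    """Instantaneous currents of a balanced three-phase set at the given angle (rad)."""
    return tuple(peak * math.cos(electrical_angle - k * 2.0 * math.pi / 3.0)
                 for k in range(3))


# Three instants 120 degrees apart in time, as in the static tests.
snapshots = {deg: phase_currents(PEAK_A, math.radians(deg)) for deg in (0, 120, 240)}

rms_equivalent = PEAK_A / math.sqrt(2.0)  # about 1.06 A r.m.s.

for deg, (ia, ib, ic) in snapshots.items():
    print(f"{deg:3d} deg: ia={ia:+.3f} A, ib={ib:+.3f} A, ic={ic:+.3f} A")
```

Under these assumptions one phase carries the full 1.5 A and the other two carry -0.75 A each at every sampled instant, so the three snapshots differ only in which phase carries the peak.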

The developed thrust or levitation depends on the relative position of the peak of
the magnetic field generated by the armature winding with respect to the poles of the
track. When the peak of the armature-generated magnetic field directly faces the center
of one of the track poles, as in Fig. 6, it will create either maximum or minimum
levitation, depending on polarity. Maximum thrust is achieved when the peak of the
magnetic field lies halfway between two neighboring poles. The resulting thrust would be
either positive or negative depending on the polarity (i.e., a positive thrust would tend
to move the motor to the right; a negative thrust to the left). For each of three
configurations representing three different mmf distributions along the armature, field
measurements were taken for four different relative positions between the armature and the
track: for maximum positive and negative thrust, and for maximum and minimum levitation.

Fig. 6. Typical measurement of B field in air gap of Nadyne motor. Conditions:
gap, 0.475"; armature current, 1.5 A DC; field current, 2 A DC; pole pitch, 1.5";
configuration of currents in phase windings, see Fig. 5; vehicle positioned for
maximum lift force. (Horizontal axis: armature teeth, vehicle.)

All of the field measurements were taken using the same gap, 0.475 inch, just
large enough to permit the insertion of the Hall probe of the gaussmeter. Force
measurements were taken for gaps ranging from 3/16 to 6/16 inch, in increments of 1/16
inch.

B. Comparison of Predicted and Measured Forces

Calculation of the thrust and normal forces yields for the 3/8 inch gap:

Thrust: $T = 0.61 \sin y - 0.031 \sin 2y$ lbs;

Lift: $F_n = -63.2 + 0.91 \cos y - 0.024 \cos 2y$ lbs.
A comparison of this prediction with the measured thrust and lift forces is shown in
Fig. 7. The degree of correspondence between the two is really a measure of how
useful the theory of the Nadyne is in describing a real situation. The theory is based on
an endless vehicle, and hence does not take into account end effects, absent in a rotating
machine; nor does it include detailed consideration of effects of armature teeth or of
fringing flux at the pole edges. In addition, there is still present in the model some
degree of tilt due to the imperfect rigidity of the track suspension. Although the theory
includes intelligent estimates of the leakage flux effects, no precision can be claimed
here. When all of these factors are considered, the underestimation of the thrust force
(by about 25%) and the overestimation of the lift force (by about 30%) represent, perhaps,
reasonably good predictions.

Fig. 7. Comparison of measured and predicted (a) thrust and (b) lift forces in Nadyne
model (lift vs. displacement, 3/8" gap).

THE CLASS OF LEHMANN ALTERNATIVES AS MEANS FOR EVALUATION OF
PERFORMANCE IN ROBUST DETECTION

B. Gotz and L. Kurz

Numerous problems in communications, radar, sonar, pattern recognition and
picture processing may be posed in terms of hypothesis testing of stochastically
ordered hypotheses. The most commonly used model of shift in the mean (change in
location) is an example of such ordering. In Ref. 1, Lehmann proposed a new and
useful method of realizing an alternative that is stochastically larger than a given hypothesis.
For a particular hypothesis distribution F(x) he considers an alternative distribution
G(x) = h(F(x)), where h is some transformation on F(x) such that G(x) ≤ F(x), i.e., the
alternative hypothesis is stochastically larger than the null hypothesis. A particularly
simple transformation that is proposed in Ref. 1 and that is utilized here is

$$G(x) = F^{1+\theta}(x), \qquad \theta > 0. \tag{1}$$

The stochastic ordering property, G(x) ≤ F(x), follows from the fact that F(x) is a
c.d.f.: since 0 ≤ F(x) ≤ 1, raising it to a power greater than one cannot increase it.
The class of alternatives specified by Eq. (1) will be referred to as the class
of Lehmann alternatives. The Lehmann alternatives have a special analytic convenience
for evaluation of performance in robust detection. For instance, the conceptually
useful efficacy (local rate of change in signal-to-noise ratio) becomes, for many detectors,
independent of the particular reference distribution F(x). There remains the major
question of relevance. In this connection it should be pointed out that in
distribution-free detection procedures the precise nature of the hypothesis separation is not
likely to be known. "What is then required are alternatives representative of principal types
of deviation from the hypothesis, in terms of which one can study, at least in outline,
the ability of various tests to detect such deviations" (p. 24 of Reference 1). The effort
here will be concentrated on how Lehmann alternatives compare to other kinds of
hypothesis separations and on the nature of the efficacy calculations for these
alternatives in m-interval detectors (Ref. 2).

A.  Lehmann  Alternatives  vs.  Other  Kinds  of  Hypothesis  Separation 

The question is: how does one relate the Lehmann indexing parameter $\theta$ in

$$H_0:\ F(x), \qquad H_1:\ F^{1+\theta}(x)$$

to other indexing parameters?

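The change in the mean and variance is a natural measure of hypothesis separation.
Under the hypothesis with the c.d.f. $F(x)$ and p.d.f. $f(x)$,

$$H_0:\quad E[x/H_0] = \int_{-\infty}^{\infty} x f(x)\,dx, \qquad \sigma^2_{H_0} = \int_{-\infty}^{\infty} [x - E(x/H_0)]^2 f(x)\,dx$$

and

$$H_1:\quad F^{1+\theta}(x) = G(x), \qquad g(x) = (1+\theta)\,F^{\theta}(x) f(x), \qquad \theta > 0$$

the change in the mean is

$$\Delta m = \int_{-\infty}^{\infty} x\,[(1+\theta)\,F^{\theta}(x) - 1]\,f(x)\,dx$$

and the change in variance is

$$\Delta\sigma^2 = \int_{-\infty}^{\infty} [x - (\Delta m + E(x/H_0))]^2\,(1+\theta)\,F^{\theta}(x) f(x)\,dx - \sigma^2_{H_0}$$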

For a specific distribution the evaluation of the above integrals as a function of $\theta$ is
generally difficult and one must resort to numerical methods. In Fig. 1 results of
numerical calculations based on a three-point Gaussian-Hermite quadrature formula
for the reference c.d.f. N(0,1) are shown. Similar calculations are readily carried
out for other reference distributions.

Fig. 1. Change in mean and standard deviation accompanying a Lehmann alternative
$F^{1+\theta}(x)$ where $F(x) = N(0,1)$.
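The change in mean and standard deviation under a Lehmann alternative can be checked numerically. The sketch below is only an illustration: it assumes the reference c.d.f. N(0,1) and replaces the three-point quadrature mentioned in the text with a plain trapezoidal rule on a truncated interval; the function names and the integration limits are choices made here, not part of the report.

```python
import math

def phi(x):
    """Standard normal p.d.f. f(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal c.d.f. F(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def lehmann_moments(theta, lo=-10.0, hi=10.0, steps=20000):
    """Mean and variance under g(x) = (1 + theta) F(x)**theta f(x)."""
    h = (hi - lo) / steps
    m1 = 0.0
    m2 = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        w = 0.5 if i in (0, steps) else 1.0      # trapezoidal end weights
        g = (1.0 + theta) * Phi(x) ** theta * phi(x)
        m1 += w * h * x * g
        m2 += w * h * x * x * g
    return m1, m2 - m1 * m1

def change_in_moments(theta):
    """(change in mean, change in standard deviation) relative to N(0, 1)."""
    mean, var = lehmann_moments(theta)
    return mean, math.sqrt(var) - 1.0

if __name__ == "__main__":
    for theta in (0.5, 1.0, 2.0, 4.0):
        dm, ds = change_in_moments(theta)
        print(f"theta={theta:3.1f}  delta_mean={dm:.4f}  delta_sigma={ds:.4f}")
```

For theta = 1 the alternative density is that of the larger of two independent N(0,1) variables, which gives a convenient closed-form check on the numerical values.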

A more compact comparison that measures hypothesis separation is the unweighted
change in area between the hypothesis and alternative c.d.f.'s, i.e.,

$$d(\theta) = \int_{-\infty}^{\infty} [F(x) - F_\theta(x)]\,dx$$

Since

$$d(\theta) \le \int_{-\infty}^{\infty} F(x)\,dx$$

and if

$$\lim_{\theta\to\infty} [F(x) - F_\theta(x)] = F(x)$$

as is true for any physically meaningful indexing, it follows that

$$\lim_{\theta\to\infty} d(\theta) = \int_{-\infty}^{\infty} F(x)\,dx$$

Both shift of the mean and Lehmann alternatives satisfy this condition with the separation
expressions, $d(\theta)$, of the form

$$\int [F(x) - F(x - \theta)]\,dx \qquad \text{and} \qquad \int [F(x) - F^{1+\theta}(x)]\,dx$$

respectively, for a reference c.d.f. $F(x)$.
The special convenience of the separation measure, $d(\theta)$, is the fact that for Lehmann
alternatives of the type $F_\theta(x) = F^{1+\theta}(x)$

$$d(\theta) = \int_{-\infty}^{\infty} [F(x) - F^{1+\theta}(x)]\,f(x)\,dx = \int_0^1 (y - y^{1+\theta})\,dy = \frac{1}{2} - \frac{1}{2+\theta}$$

independent of the reference c.d.f. $F(x)$. For another type of hypothesis indexing and
some reference c.d.f., $d(\theta)$ need be calculated and compared to $\frac{1}{2} - \frac{1}{2+\theta}$.

A comparison of $d(\theta)$ for shift alternatives and standardized Cauchy and Gaussian
c.d.f.'s to Lehmann alternatives is shown in Figure 2. Calculations were performed
using a 4-point Gaussian-Hermite quadrature formula. The results of Fig. 2 imply
that $d(\theta)$ for Lehmann alternatives represents a conservative bound on the separation
measure for shift alternatives.
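The comparison can be reproduced for the Gaussian case with a short numerical sketch. It uses the f(x)-weighted form of the separation measure that appears in the closed-form Lehmann expression, a trapezoidal rule instead of the 4-point quadrature of the text, and hypothetical function names; treat it as an illustration under those assumptions.

```python
import math

def phi(x):
    """Standard normal p.d.f."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def integrate(fn, lo=-12.0, hi=12.0, steps=24000):
    """Trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (fn(lo) + fn(hi))
    for i in range(1, steps):
        total += fn(lo + i * h)
    return total * h

def d_lehmann(theta):
    """Closed form 1/2 - 1/(2 + theta), independent of F."""
    return 0.5 - 1.0 / (2.0 + theta)

def d_lehmann_numeric(theta):
    """Same measure evaluated numerically for F = N(0, 1)."""
    return integrate(lambda x: (Phi(x) - Phi(x) ** (1.0 + theta)) * phi(x))

def d_shift_gauss(theta):
    """Separation measure for a shift alternative F(x - theta), F = N(0, 1)."""
    return integrate(lambda x: (Phi(x) - Phi(x - theta)) * phi(x))

if __name__ == "__main__":
    for theta in (0.5, 1.0, 2.0, 4.0):
        print(theta, round(d_lehmann(theta), 4), round(d_shift_gauss(theta), 4))
```

For the values of theta tried here the Lehmann measure stays below the Gaussian shift measure, in line with the conservative-bound reading of Fig. 2.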


Fig. 2. A hypothesis separation measure compared for various alternatives.
(Vertical axis: $d(\theta) = \int [F(x) - F_\theta(x)]\,f(x)\,dx$; horizontal axis: $\theta$. Curves: shift
alternative, $F(x) = N(0,1)$; shift alternative, Cauchy; Lehmann alternative, independent
of $F(x)$.)


B.  Lehmann  Alternative  Efficacy  Calculations 


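$$H_0:\ F_0(x) = F(x)$$

$$H_1:\ F_1(x) = F^{1+\theta}(x) = g(\theta) = F_\theta(x), \qquad \theta > 0$$

By the mean value theorem

$$g(\theta) = g(0) + \theta\,g'(\tilde\theta), \qquad 0 < \tilde\theta < \theta$$

and, assuming that $g'(\theta)$ is continuous in $\theta$, for small $\theta$

$$g(\theta) \approx g(0) + \theta\,g'(0)$$

where $g(0) = F(x)$ and

$$g'(0) = [\ln F(x)]\,F^{1+\theta}(x)\,\big|_{\theta=0} = F(x)\ln F(x)$$

For small $\theta$

$$F_\theta(x) \approx F(x) + \theta\,F(x)\ln F(x) \tag{2}$$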


For m-interval detectors, the test statistic for efficacy calculations is

$$T = \frac{1}{L}\sum_{i=1}^{m} b_i n_i$$

where $L$ is the number of samples, $m$ the number of partitions, $b_i$ the scores, and $n_i$ the
occupancy indicators (number of samples) between two adjacent quantiles. It can be
shown that the conditional expectation reduces to

$$E[T/H_1] = E[T/H_0] + \theta \sum_{i=1}^{m} b_i\,[F(a_i)\ln F(a_i) - F(a_{i-1})\ln F(a_{i-1})] \tag{3}$$

where $\{a_i\}$ are the quantiles of $F(x)$.


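Similarly, it can be shown that

$$\sigma^2_{T/H_0} = \frac{1}{L}\Big\{\sum_{i=1}^{m} b_i^2\,[F(a_i) - F(a_{i-1})] - \Big(\sum_{i=1}^{m} b_i\,[F(a_i) - F(a_{i-1})]\Big)^2\Big\} \tag{4}$$

Since the efficacy of a detector $T$ for the reference distribution $F(x)$ is defined by

$$\mathcal{E}_{T,F} = \lim_{L\to\infty} \frac{\big[\partial E(T/H_1)/\partial\theta\,\big|_{\theta=0}\big]^2}{L\,\sigma^2_{T/H_0}} \tag{5}$$

it follows from Eqs. (3)-(5) that

$$\mathcal{E}_{T,F} = \frac{\Big\{\sum_{j=1}^{m} b_j\,[F(a_j)\ln F(a_j) - F(a_{j-1})\ln F(a_{j-1})]\Big\}^2}{\sum_{j=1}^{m} b_j^2\,[F(a_j) - F(a_{j-1})] - \Big\{\sum_{j=1}^{m} b_j\,[F(a_j) - F(a_{j-1})]\Big\}^2} \tag{6}$$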


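Of particular interest is the special case of equiprobable partitioning with

$$F(a_j) - F(a_{j-1}) = \frac{1}{m}, \qquad j = 1, \ldots, m$$

For equiprobable partitioning, Eq. (6) reduces to

$$\mathcal{E}_{T,F} = \frac{\Big\{\frac{1}{m}\sum_{j=1}^{m} b_j\,[\,j\ln j - (j-1)\ln(j-1) - \ln m\,]\Big\}^2}{\frac{1}{m}\sum_{j=1}^{m} b_j^2 - \Big(\frac{1}{m}\sum_{j=1}^{m} b_j\Big)^2} \tag{7}$$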


As was shown in Refs. 2 and 3, for the optimal m-interval detector

$$b_j = \ln\{m\,[F_1(a_j) - F_1(a_{j-1})]\}$$

which reduces for small signal-to-noise ratio (i.e., locally) to

$$b_i = \theta\,[\,i\ln i - (i-1)\ln(i-1) - \ln m\,] \tag{8}$$

where use has been made of the approximation $\ln(1+\xi) \approx \xi$ for small $\xi$. From Eqs.
(7) and (8), it follows that for Lehmann alternatives and equiprobable partitioning, the
efficacy reduces to

$$\mathcal{E}_{T,F} = \frac{1}{m}\sum_{j=1}^{m} [\,j\ln j - (j-1)\ln(j-1) - \ln m\,]^2 \tag{9}$$

which is independent of the reference c.d.f. $F(x)$.

After some mathematical manipulations, it can be shown that the limiting
Lehmann alternative efficacy expression for the locally most powerful m-interval
detector is

$$\lim_{m\to\infty} \mathcal{E}_{T,F} = 1 \tag{10}$$
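The behaviour of Eq. (9) as the number of intervals grows is easy to tabulate. The sketch below evaluates the sum directly; the helper names are hypothetical, and the convention 0 ln 0 = 0 is assumed for the j = 1 term.

```python
import math

def xlogx(t):
    """t * ln(t) with the convention 0 * ln(0) = 0."""
    return 0.0 if t == 0 else t * math.log(t)

def local_scores(m):
    """c_j = j ln j - (j-1) ln(j-1) - ln m for j = 1..m (Eq. (8) without theta)."""
    return [xlogx(j) - xlogx(j - 1) - math.log(m) for j in range(1, m + 1)]

def lehmann_efficacy(m):
    """Efficacy of Eq. (9) for m equiprobable intervals."""
    scores = local_scores(m)
    return sum(c * c for c in scores) / m

if __name__ == "__main__":
    for m in (2, 4, 8, 16, 64, 256, 1024):
        print(f"m={m:5d}  efficacy={lehmann_efficacy(m):.5f}")
```

The scores sum to zero for every m, which is why the second term in the denominator of Eq. (7) drops out, and the tabulated efficacy climbs toward the limit of Eq. (10).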


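The result of Eq. (10) must agree with the general local efficacy bound

$$\mathcal{E}_{T,F} \le E\Big\{\Big[\frac{\partial \ln f_\theta(x)}{\partial\theta}\Big]^2\Big\}\Big|_{\theta=0} \tag{11}$$

In addition, as for the shift alternative, it is expected that as $m \to \infty$, the efficacy of the
locally most powerful m-interval detector approaches the efficacy of the locally most
powerful detector on the raw data, i.e., $\lim_{m\to\infty} \mathcal{E}_{T,F} = \sup \mathcal{E}_{T,F}$, which would require

$$E\Big\{\Big[\frac{\partial \ln f_\theta(x)}{\partial\theta}\Big]^2\Big\}\Big|_{\theta=0} = 1$$

independently of $F(x)$. This indeed is the case because

$$F_\theta(x) = F^{1+\theta}(x) \ \rightarrow\ f_\theta(x) = (1+\theta)\,F^{\theta}(x) f(x) \ \rightarrow\ \ln f_\theta(x) = \ln(1+\theta) + \theta \ln F(x) + \ln f(x)$$

and

$$E\Big\{\Big[\frac{\partial \ln f_\theta(x)}{\partial\theta}\Big]^2\Big\}\Big|_{\theta=0} = E\Big\{\Big[\frac{1}{1+\theta} + \ln F(x)\Big]^2\Big\}\Big|_{\theta=0} = 1$$

Thus, it can be concluded that for Lehmann alternatives

$$\mathcal{E}_{T,F} \le 1$$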

It should be indicated at this point that in comparing detectors using asymptotic relative
efficiency by comparing efficacies, one must restrict oneself to a specified alternative
hypothesis indexing. Comparing efficacies with different alternative indexing is
meaningless.


B.  Gotz  and  L.  Kurz 
SIGNAL DESIGN AND RANK M-ARY DETECTION
Y.C.  Chinp  and  L.  Kurz 

In this report, M-ary nonparametric detectors are considered with the associated
signalling alphabets. Unlike the class of parametric detectors for which the
classification procedures are based on the optimum partitioning of the sample space
associated with the received signals corrupted by noise of known distribution, the
nonparametric detectors are designed by first selecting a nonparametric partitioning of
the sample space. In turn, the latter selection, motivated primarily by the type of
limited knowledge available about the noise distribution and by the simplicity and
flexibility of the resulting detector structure, governs the selection of signalling
alphabets. Systems of this type are particularly useful in a severe noise environment
and, because of their M-ary nature, they utilize the available bandwidth efficiently.
Though M-ary nonparametric detectors were considered previously, in this report the
problem of joint detector and signal selection based on nonparametric partitioning of the
space is considered for the first time. In particular, the nonparametric detectors which
are based on the rank partitioning (rank detectors) are considered. For the rank
detectors, a set of signal vectors which are permutations of one another is selected.

Two rank detectors, both using Chernoff-Savage test statistics, are proposed
for the signalling alphabet. Due mainly to the asymptotic normality of the
Chernoff-Savage test statistics, independent of which signal has been transmitted, the
resulting detectors are equivalent to a maximum likelihood detector based on the test
statistics. The performance of these detectors, as measured by Pitman-Noether ARE, is
shown to be the same as that of their binary counterparts.

A.  Signal  Selection  for  M-ary  Nonparametric  Detection 

Denote by $U = \{u^{(i)},\ i = 1, \ldots, M\}$ the set of possible signal vectors and
$V = \{v_j,\ j = 1, \ldots, N\}$ the set of nonparametric partitions and let $u^{(r)} \in v_k$ and
$u^{(s)} \in v_l$.

Denote by $d(x,y)$ the distance between $x$ and $y$, where $x$ and $y$ could be either vectors or
nonparametric partitions, or a mixture of both. Because the exact noise distribution
is unknown, the choice of a distance is quite arbitrary. However, by definition of
nonparametric partitioning, it is assumed that

$$d(u^{(r)}, u^{(s)}) = d(u^{(s)}, u^{(r)}) \tag{1}$$

$$\phantom{d(u^{(r)}, u^{(s)})} = d(u^{(r)}, v_l) \tag{2}$$

$$\phantom{d(u^{(r)}, u^{(s)})} = d(v_k, v_l) \tag{3}$$

except for the vectors on the boundaries of the nonparametric partitions. The detection
rule is to assume that $u^{(r)}$ was transmitted, for a received observable $Z$, if

$$d(Z, u^{(r)}) < d(Z, u^{(i)}) \qquad \text{for all } i \ne r \tag{4}$$

However, if there are two signal vectors $u^{(r)}$ and $u^{(s)}$ belonging to the same
nonparametric partition, $v_k$, then

$$d(Z, u^{(r)}) = d(Z, v_k) \tag{5}$$

$$\phantom{d(Z, u^{(r)})} = d(Z, u^{(s)}) \qquad \text{for all } Z \tag{6}$$

and there is no way to distinguish $u^{(r)}$ and $u^{(s)}$ unless the noise distribution is known,
in which case the problem reduces to a parametric one. From these considerations,
it is clear that the signal selection for a nonparametric M-ary detection problem
depends on the manner of nonparametric partitioning. Specifically, there can be at most
one signal vector per nonparametric partition, or $M \le N$.

In this report, only M-ary rank detectors are considered. The rank detectors
are based on the nonparametric partitions which result from separating the
n-dimensional Euclidean sample space by the set of hyperplanes

$$\{z_i = z_j,\ \text{all } i \ne j\} \tag{7}$$

The set of rank partitions $\{R_j,\ j = 1, 2, \ldots, n!\}$ is isomorphic to the set of rank vectors
$\{R^{(j)},\ j = 1, \ldots, n!\}$. Furthermore, since each rank vector is contained in a rank
partition, it can be regarded as a representative vector of the rank partition. Often the
distance between the rank partitions is defined on the basis of their representative
vectors. For example, Spearman's rank correlation coefficient is defined in terms of the
squared Euclidean distance between the rank vectors.


For a rank partition $R_i$, if $u^{(i)} \in R_i$, then

$$a\,u^{(i)} \in R_i \tag{8}$$

for any positive constant $a$. Thus

$$d(Z, u^{(i)}) = d(Z, R_i) \tag{9}$$

$$\phantom{d(Z, u^{(i)})} = d(Z, a\,u^{(i)}) \tag{10}$$

for all received observables $Z$. It is, therefore, logical to limit the signals to the set
of equal-energy signals, i.e., if

$$u^{(i)} = (u_1^{(i)}\ u_2^{(i)}\ \cdots\ u_n^{(i)}) \tag{11}$$

then

$$E^{(i)} = \sum_{k=1}^{n} [u_k^{(i)}]^2 = E, \qquad i = 1, 2, \ldots, M \tag{12}$$

where $E$ represents the energy associated with the signalling set $U$.

The actual signalling set is selected as follows. Let $M$ be the number of signals
and $n$ be the length of the signal vector. For simplicity, assume that $b$ and $q$ are some
integers such that

$$n = bq \tag{13}$$

$$M \le b! \tag{14}$$

Let

$$u^{(1)} = (\mathbf{s}_1\ \mathbf{s}_2\ \cdots\ \mathbf{s}_b) \tag{15}$$

be one of the signals, where

$$\mathbf{s}_k = a_k\,(1\ 1\ \cdots\ 1) \tag{16}$$

is a vector of length $q$ and

$$a_1 < a_2 < \cdots < a_b \tag{17}$$

The signalling set is then formed by permuting the $a_k$'s. It should be noted that the
permutation is performed on $q$ components of $u^{(1)}$ as an entity and $a_k$ corresponds to the
k-th level of the signal $u^{(1)}$. Each level is repeated $q$ times to increase the reliability
and also to guarantee that the test statistics would be approximately normally
distributed. This normality is necessary to construct a maximum likelihood detector.
There are at most $b!$ signal vectors and each can be represented by vectors of length $b$ of
the form

$$S^{(i)} = (s_1^{(i)}\ s_2^{(i)}\ \cdots\ s_b^{(i)}) \tag{18}$$

which are permutations of

$$S^{(1)} = (a_1\ a_2\ \cdots\ a_b) \tag{19}$$

To each signal corresponds a rank vector of length $b$, which is the appropriate
permutation of the identity rank vector

$$R^{(1)} = (1\ 2\ \cdots\ b) \tag{20}$$
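The construction of the signalling set can be sketched in a few lines. The function name and the example levels below are illustrative assumptions; the code simply enumerates the permutations of b distinct levels, repeats each level q times, and records the rank vector of each signal.

```python
from itertools import permutations

def signalling_set(levels, q):
    """Return (rank_vector, signal_vector) pairs for all b! level permutations.

    levels : b distinct amplitudes a_1 < ... < a_b (sorted internally)
    q      : number of repetitions of each level, so each signal has length b*q
    """
    levels = sorted(levels)
    rank_of = {a: r for r, a in enumerate(levels, start=1)}
    signals = []
    for perm in permutations(levels):
        vector = [a for a in perm for _ in range(q)]   # each level repeated q times
        ranks = tuple(rank_of[a] for a in perm)        # permutation of (1 2 ... b)
        signals.append((ranks, vector))
    return signals

if __name__ == "__main__":
    for ranks, vector in signalling_set([1.0, 2.0, 3.0], q=2):
        energy = sum(x * x for x in vector)
        print(ranks, vector, energy)
```

Because every signal is a rearrangement of the same levels, all of them carry the same energy, and an alphabet of size M is any subset of at most b! of these vectors.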

At the receiver, the observable is a vector of length $bq$ of the form

$$Z = (z_{11} \cdots z_{1q},\ \ldots,\ z_{b1} \cdots z_{bq}) \tag{21}$$

Let the noise vector be denoted by

$$N = (n_{11} \cdots n_{1q},\ \ldots,\ n_{b1} \cdots n_{bq}) \tag{22}$$

then, when $u^{(i)}$ is transmitted,

$$z_{kr} = s_k^{(i)} + n_{kr}, \qquad k = 1, 2, \ldots, b; \quad r = 1, 2, \ldots, q; \quad i = 1, 2, \ldots, M \tag{23}$$

The noise process is assumed to be stationary and the sampling rate sufficiently low
so that the $n_{kr}$ are i.i.d. random variables with an unknown distribution $F(z)$. Denote

$$Z_k = \{z_{kr},\ r = 1, 2, \ldots, q\} \tag{24}$$

as the k-th sample of the observable $Z$ and the corresponding rank vector by

$$R(Z) = (r_1\ r_2\ \cdots\ r_b) \tag{25}$$

where $r_j$ is the rank of the j-th sample $Z_j$ among $\{Z_k,\ k = 1, \ldots, b\}$. Since

$$Z_k :\ F(z - s_k^{(i)}) \tag{26}$$

it follows that, when $u^{(i)}$ is transmitted,

$$R(Z) = R^{(i)} \tag{27}$$

Thus a detector which senses the stochastic ordering of all $Z_k$, $k = 1, 2, \ldots, b$, can
differentiate among transmitted signals.

B.  Two  Classes  of  Detectors 


The detectors proposed here are motivated by the form of the optimum M-ary
Gaussian noise detector. As for the Gaussian noise detector, the detector is decomposed
into two parts: a sample processor and $M$ matching networks, each corresponding
to an element of the signalling set. However, while only the sample mean (or
Student's t test) is generated in the sample processor of the Gaussian noise detector,
nonparametric test statistics are formed in the sample processor of the proposed
detectors. The matching networks of the nonparametric detectors are designed as
follows. Let $L_i$ be the output of the matching network corresponding to a signal
$u^{(i)} \in U$, and $E[L_i/u^{(j)}]$ be the conditional expectation of $L_i$ when $u^{(j)}$ is transmitted.
The matching network is a device with the property

$$E[L_j/u^{(j)}] = \max_i E[L_i/u^{(j)}] \tag{28}$$

while some optimality criterion is satisfied. Since in most of the cases the
nonparametric tests tend to be singular for large sample sizes, the matching networks
ensure the consistency of the detector. Finally, the detector compares all $L_i$'s and
selects the signal $u^{(j)}$ corresponding to $L_j = \max_i L_i$.


1. Detector I

In this detector the sample processor forms the k-sample version of the two-sample
Chernoff-Savage test statistics as developed by Puri (Ref. 4). If the $bq$ observations
of vector $Z$ are ordered so that

$$z_{(1)} \le z_{(2)} \le \cdots \le z_{(bq)} \tag{29}$$

the Puri $T_k$-statistic is defined by

$$T_k = \frac{1}{q}\sum_{r=1}^{bq} d_r\,c_{rk}, \qquad k = 1, \ldots, b \tag{30}$$

where the $d_r$'s are ordered constants satisfying certain conditions and

$$c_{rk} = \begin{cases} 1 & \text{if } z_{(r)} \text{ is from } Z_k \\ 0 & \text{otherwise} \end{cases} \tag{31}$$

All $b$ of the $T_k$-statistics are shaped by the appropriate matching networks, yielding a
new test statistic

$$L_i = \sum_{k=1}^{b} T_k\,s_k^{(i)} \tag{32}$$

Since

$$\sum_{k=1}^{b} T_k = \frac{1}{q}\sum_{r=1}^{bq} d_r \tag{33}$$

must be satisfied, all $T_k$'s are correlated. It can be shown that the proposed two-stage
nonparametric M-ary detector is equivalent to a locally maximum likelihood
detector using $T_k$-statistics.
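A small simulation makes the two-stage structure concrete. The sketch below is an illustration only: it assumes Wilcoxon-type scores d_r = r/(bq), Gaussian noise for the test signal, and level vectors that are permutations of (0, 10, 20); none of these choices come from the report, and the function names are hypothetical.

```python
import random
from itertools import permutations

def puri_statistics(samples, scores=None):
    """T_k = (1/q) * sum_r d_r c_rk for b samples of q observations each."""
    b = len(samples)
    q = len(samples[0])
    n = b * q
    if scores is None:
        scores = [r / n for r in range(1, n + 1)]      # assumed d_r = r/(bq)
    # Pool and order all bq observations, remembering the sample of origin.
    pooled = sorted((value, k) for k, sample in enumerate(samples) for value in sample)
    t = [0.0] * b
    for r, (_, k) in enumerate(pooled):
        t[k] += scores[r] / q
    return t

def matching_outputs(t, level_vectors):
    """L_i = sum_k T_k s_k^(i) for every candidate level vector S^(i)."""
    return [sum(tk * sk for tk, sk in zip(t, s)) for s in level_vectors]

def detect(samples, level_vectors):
    """Index of the signal whose matching network output is largest."""
    outputs = matching_outputs(puri_statistics(samples), level_vectors)
    return max(range(len(outputs)), key=lambda i: outputs[i])

def received(levels, q, rng, noise_sd=1.0):
    """Observable Z: each level repeated q times plus i.i.d. noise."""
    return [[a + rng.gauss(0.0, noise_sd) for _ in range(q)] for a in levels]

if __name__ == "__main__":
    rng = random.Random(0)
    alphabet = list(permutations((0.0, 10.0, 20.0)))
    z = received(alphabet[4], q=8, rng=rng)
    print("T_k:", puri_statistics(z))
    print("decision:", detect(z, alphabet), "transmitted: 4")
```

The decision relies on the fact that a sum of products is largest when the two sequences are ordered alike, so the matching network whose levels share the ordering of the T_k produces the maximum output.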

Introducing 
$$H_q(z) = \frac{1}{b}\sum_{k=1}^{b} F_{qk}(z) \tag{34}$$

where $F_{qk}(z)$ is the empirical distribution of $Z_k$ and $H_q(z)$ is the empirical distribution
of $Z$, and defining the score-generating function

$$J(x) = \lim_{q\to\infty} J_q(x) \tag{35}$$

where

$$J_q(r/bq) = d_r \tag{36}$$

the $T_k$-statistic may be rewritten as

$$T_k = \int_{-\infty}^{\infty} J_q(H_q(z))\,dF_{qk}(z), \qquad k = 1, 2, \ldots, b \tag{37}$$


It can be shown that the $T_k$'s are asymptotically normal. Furthermore, defining the vectors

$$T = (T_1\ T_2\ \cdots\ T_b)' \tag{38}$$

and

$$\mu = (\mu_1\ \mu_2\ \cdots\ \mu_b)' \tag{39}$$

where

$$\mu_k = \int_{-\infty}^{\infty} J(H(z))\,dF_k(z), \qquad k = 1, 2, \ldots, b \tag{40}$$

then the vector statistic $q^{1/2}(T - \mu)$ has a limiting normal distribution with zero mean
and known covariance function. For each of the signals

$$E[T/u^{(i)}] = \mu^{(i)}$$

and

$$\mathrm{cov}(T/u^{(i)}) = \Sigma_i$$

The matching networks of the detector are formed by

$$L_i = T' S^{(i)}, \qquad i = 1, 2, \ldots, M \tag{41}$$


Since $T$ is asymptotically normal so are the $L_i$'s (linear transformations), and they are
completely specified asymptotically by

$$E[L_i/u^{(j)}] = E[T'/u^{(j)}]\,S^{(i)} \tag{42}$$

$$\phantom{E[L_i/u^{(j)}]} = \mu^{(j)\prime} S^{(i)} \tag{43}$$

$$\mathrm{var}[L_i/u^{(j)}] = S^{(i)\prime}\,\mathrm{cov}(T/u^{(j)})\,S^{(i)} \tag{44}$$

$$\phantom{\mathrm{var}[L_i/u^{(j)}]} = S^{(i)\prime}\,\Sigma_j\,S^{(i)} \tag{45}$$

Since $J(x)$ is usually a monotone function, there is a strong correlation between the
transmitted signal $u^{(i)}$ and the mean vector $\mu^{(i)}$ of the statistic $T$. Consider a specific
signal $u^{(1)}$. It can be easily shown that

$$\mu_1^{(1)} < \mu_2^{(1)} < \cdots < \mu_b^{(1)} \tag{46}$$

Denote the rank vector of the mean vector $\mu^{(i)}$ by

$$R(\mu^{(i)}) = (R(\mu_1^{(i)})\ R(\mu_2^{(i)})\ \cdots\ R(\mu_b^{(i)})) \tag{47}$$

where $R(\mu_k^{(i)})$ is the rank of $\mu_k^{(i)}$ among $\{\mu_k^{(i)};\ k = 1, \ldots, b\}$, then

$$R(\mu^{(1)}) = R^{(1)} \tag{48}$$

Let $Q_i$ be a permutation matrix such that

$$S^{(i)} = Q_i\,S^{(1)} \tag{49}$$

Since the signals are permutations of one another

$$\mu^{(i)} = Q_i\,\mu^{(1)} \tag{50}$$

and

$$R(\mu^{(i)}) = R^{(i)} \tag{51}$$

or the stochastic order of the random variables $\{T_k :\ k = 1, \ldots, b\}$ is the same as the
order of $\{s_k^{(i)} :\ k = 1, 2, \ldots, b\}$ when $u^{(i)}$ is transmitted. Forming the conditional
expectation of $L_i$ when $u^{(j)}$ has been transmitted,

$$E[L_i/u^{(j)}] = \mu^{(j)\prime} S^{(i)} \tag{52}$$

and using the fact that the right-hand side of Eq. (52) is maximized if $R(\mu^{(j)}) = R^{(i)}$, it
is concluded that

$$E[L_j/u^{(j)}] = \max_i\,\{E[L_i/u^{(j)}]\}$$

which is the desirable property of the matching networks. Since, in addition, the
covariance matrix $\Sigma_i$ approaches the null matrix as the sample size tends to infinity,
this detector is consistent. The decision rule is, therefore, to accept the signal $u^{(j)}$ if

$$L_j = \max_i\,\{L_i\}$$


2. Detector II


In practical systems it may be more convenient to compare the samples pairwise
and form $b(b-1)/2$ statistics in the sample processor. Each of these statistics
will indicate the stochastic order between a pair of samples. The matching networks
will perform the selection of signals as before. However, these networks should have
$b(b-1)/2$ inputs instead of $b$.

For the same signalling alphabet as in Section A and a pair of samples $Z_r$ and $Z_s$,
the statistics formed in the sample processor of the modified detector are

$$T_{rs} = \int_{-\infty}^{\infty} J_q(H_q(z))\,dF_{qr}(z)$$

where

$$H_q(z) = \tfrac{1}{2}\,(F_{qr}(z) + F_{qs}(z))$$

and $F_{qr}(z)$ and $F_{qs}(z)$ are the sample c.d.f.'s of $Z_r$ and $Z_s$, respectively. Define a
random vector

$$V = (T_{12}\ T_{13}\ \cdots\ T_{(b-1)b})'$$

It can be shown that $V$ is asymptotically normal with a known mean $\eta^{(i)}$ and covariance
matrix $\Lambda_i$ when $u^{(i)}$ has been transmitted.

When  the  signal  separation  is  small  (low  signal-to-noise  ratio),  the  covariance 
matrix  is  independent  of  the  transmitted  signal  resulting  in 


if  k = r or  i = s 


k = s or  i = r 


otherwise 


where 
$$= 2 \iint_{x<y} F(x)\,(1 - F(y))\,J'(F(x))\,J'(F(y))\,dF(x)\,dF(y)$$


Introducing


rs 


r 


8 


S<i>  = (s<i)  sW  . . 


12  13 


,(i) 

’(b-l)b' 


then 

= G 

where  G is  a b(b  - 1)  by  b matrix 


(58) 


(59) 


The  modified  test  statistics  formed  in  the  matching  networks  are  then 


(60) 


W.  i=l,2.---,M  (61) 

ID* 

Since  all  W.'s  are  linear  combinations  of  V,  they  are  also  asymptotically  normal.  The 
corresponding  moments  are 

E [W. /u< = Ti  < G Vb  ( 62) 

Var[W./u^jb  = G'  A.GS^^V^^  (63) 


-''I.'*’'’' 




C.  Performance  of  the  Detectors 

In this section the performance of the two detectors in terms of Pitman-Noether
asymptotic relative efficiency (ARE) with respect to the optimum detector (sample
mean) in Gaussian noise is considered. Since the two detectors are equivalent for
small signal-to-noise ratios, it suffices to find the ARE for detector I.

Usually, the ARE is defined only for binary detectors. In order to adapt the
criterion to M-ary detectors, we define a pair-wise ARE of two detectors used to test
whether $u^{(i)}$ or $u^{(j)}$ has been transmitted. Denote this ARE as ARE(ij). The relative
performance of two detectors, $D_1$ and $D_2$, is then rated by the average of these
pair-wise AREs

$$\mathrm{ARE}(D_1, D_2) = \sum_{i<j} \mathrm{ARE}(ij) \tag{64}$$

Since the a priori probabilities of transmission are unknown, the pair-wise detection
is symmetric, i.e., the decision rule is

    accept $u^{(i)}$ if $L_i > L_j$
    accept $u^{(j)}$ if $L_i < L_j$

Thus, we can use for the sake of analysis an equivalent test statistic

$$L_{ij} = T'(S^{(i)} - S^{(j)}), \qquad \text{all } i \ne j \tag{65}$$


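Following the procedure of Section B, it can be shown that $L_{ij}$ is asymptotically normal
with

$$E[L_{ij}/u^{(j)}] = \mu^{(j)\prime}(S^{(i)} - S^{(j)}) \tag{66}$$

and

$$\mathrm{var}[L_{ij}/u^{(j)}] = (S^{(i)} - S^{(j)})'\,\Sigma_j\,(S^{(i)} - S^{(j)}) \tag{67}$$

The efficacy of $L_{ij}$ is readily found to be

$$\mathcal{E}_{L_{ij}} = \frac{\Big(\int_{-\infty}^{\infty} \frac{d}{dz} J(F(z))\,dF(z)\Big)^2}{\int_0^1 J^2(x)\,dx - \Big(\int_0^1 J(x)\,dx\Big)^2} \tag{68}$$

The efficacy of the optimum Gaussian detector is $1/\sigma^2$. The ARE(ij) for any pair of
signals is then

$$\mathrm{ARE}(i,j) = \sigma^2\,\frac{\Big(\int_{-\infty}^{\infty} \frac{d}{dz} J(F(z))\,dF(z)\Big)^2}{\int_0^1 J^2(x)\,dx - \Big(\int_0^1 J(x)\,dx\Big)^2} \tag{69}$$

which is the same as for binary detectors where Chernoff-Savage statistics are
compared with the optimum detector in Gaussian noise.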
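As a numerical illustration of the pair-wise ARE, the sketch below evaluates the ratio for a Wilcoxon-type score function J(x) = x in N(0,1) noise. The score function, the noise law, the trapezoidal integration, and all names are assumptions chosen for the example; the quantity computed is sigma^2 times the squared integral of J'(F(z)) f(z)^2 divided by the variance of J over (0, 1).

```python
import math

def trapezoid(fn, lo, hi, steps):
    """Trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (fn(lo) + fn(hi))
    for i in range(1, steps):
        total += fn(lo + i * h)
    return total * h

def pairwise_are(score, score_prime, pdf, cdf, sigma2,
                 lo=-12.0, hi=12.0, steps=24000):
    """ARE of a rank detector with score function J relative to the sample mean."""
    # d/dz J(F(z)) integrated against dF(z) equals the integral of J'(F) f^2.
    slope = trapezoid(lambda z: score_prime(cdf(z)) * pdf(z) ** 2, lo, hi, steps)
    mean_j = trapezoid(score, 0.0, 1.0, steps)
    mean_j2 = trapezoid(lambda x: score(x) ** 2, 0.0, 1.0, steps)
    return sigma2 * slope ** 2 / (mean_j2 - mean_j ** 2)

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

if __name__ == "__main__":
    are = pairwise_are(lambda x: x, lambda x: 1.0, normal_pdf, normal_cdf, sigma2=1.0)
    print(f"Wilcoxon-type scores in Gaussian noise: ARE = {are:.4f}")
```

For this choice the result is the familiar value 3/pi, about 0.955, so the rank detector gives up little against the sample mean even when the noise really is Gaussian.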

In using one of the two proposed classes of detectors the structure of a particular
detector is dictated by the underlying class of noise distributions, ease of
implementation and the signal processing time. Though the analytical results based on
ARE are inconclusive, the proposed approach is sufficiently general and flexible to
permit a choice of a particularly good detector structure for a given application. The
latter choice can be easily obtained from digital computer simulations with final
adjustments in an actual system. Because of the form of the detectors, the proposed M-ary
nonparametric detection and signal selection procedures have similar advantages, when
compared with the binary procedures, as in the Gaussian noise case.
ON IMPROVEMENT IN PERFORMANCE OF TRANSMISSION SYSTEMS IN ADDITIVE
GAUSSIAN AND BURST NOISE

L. Kurz and R.J. Nawrocky


In recent years, the general problem of signal and detector design in burst noise
has been of considerable interest. Several widely varying approaches have been used
to attack the problem (Refs. 1-9). Recently the authors introduced a method for signal
and detector optimization in burst and Gaussian noise using two approaches within the
framework of the discrete maximum principle. In this presentation alternate approaches
to the problem, using multi-threshold and varying threshold procedures, are discussed.
In addition, the influence of burst noise on M-ary system performance in Gaussian
noise and intersymbol interference is investigated.


A.  Variable  Threshold  Systems 

Before proceeding with the description of this procedure, consider the decision
process based on a single threshold level in Gaussian noise alone. In this case, the
input to the detector is

$$H_0:\ z(k) = n_g(k) \tag{1}$$

$$H_1:\ z(k) = x(k) + n_g(k) \tag{2}$$

where $H_0$ and $H_1$ represent the hypotheses that there is no signal or there is a signal
present, respectively.

For a zero-mean, white Gaussian noise of variance $N_0/2$, under the hypothesis
$H_1$, the detector output at $k = K$ is Gaussian with mean and variance given by

$$E[y(K)] = \Delta \sum_{k=0}^{K} z(k)\,g(k) = Y_1 \tag{3}$$

and

$$\mathrm{Var}[y(K)] = \frac{N_0}{2}\,\Delta \sum_{k=0}^{K} g^2(k) = V_1 \tag{4}$$

where the notation of Ref. 10 has been used. It is well known that in this case the
optimum detector is based on the comparison of $y(K)$ to a threshold $y_T$, and the setting of
the threshold depends on the preselected criterion of performance.


In additive Gaussian and burst noise, on the other hand, the mean and the
variance of $y(K)$ are functions of the burst noise parameters $c$ (location) and $q$ (width). If
the receiver nonlinearity suppresses all input to the detector during the burst, the
mean and variance of $y(K)$ under the hypothesis $H_1$ for any permissible values $c_j$ and
$q_j$ are


E {y(Cj,qj,K)  = < Yj 

(5) 

var  {y(Cj,q^,K))  = 

(6) 

It is obvious that in burst noise decisions based on the fixed level $y_T$ ($y_T$ is the
optimum threshold in Gaussian noise) could result in significant errors of the second
kind. Of course, threshold levels lower than $y_T$ would increase the error of the first
kind in Gaussian noise. The fact that a priori knowledge of the burst width may be
made available using simple threshold logic circuitry suggests the use of a variable
threshold level detection scheme which in effect removes the dependence of Eqs. (5)
and (6) on $q$ and thus decreases the errors.

From practical considerations, the detection procedure could be implemented in
terms of a number of fixed levels rather than a single variable level. Implementations
of burst suppressor and multi-threshold detectors are shown in Figs. 1 and 2,
respectively.
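The benefit of adapting the threshold to the burst width can be illustrated with a toy calculation. Everything specific in the sketch below is an assumption made for the example: a rectangular signal matched by g(k) = x(k), a suppressor that zeroes the samples inside the burst, a threshold placed halfway to the conditional mean, and the hypothetical function names.

```python
import math

def q_func(t):
    """Gaussian tail probability P(N(0,1) > t)."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def output_moments(x, g, delta, n0, burst=None):
    """Mean and variance of y(K) under H1 when samples in the burst are suppressed.

    burst : None, or (start_index, width) of the suppressed samples.
    """
    start, width = burst if burst is not None else (0, 0)
    keep = [k for k in range(len(x)) if not (start <= k < start + width)]
    mean = delta * sum(x[k] * g[k] for k in keep)
    var = 0.5 * n0 * delta * sum(g[k] ** 2 for k in keep)
    return mean, var

def error_probabilities(mean, var, threshold):
    """(error of the first kind, error of the second kind) for a given threshold."""
    sd = math.sqrt(var)
    first_kind = q_func(threshold / sd)              # noise alone exceeds threshold
    second_kind = q_func((mean - threshold) / sd)    # signal falls below threshold
    return first_kind, second_kind

if __name__ == "__main__":
    x = g = [1.0] * 10
    delta, n0 = 0.1, 0.02
    y1, _ = output_moments(x, g, delta, n0)
    mean_b, var_b = output_moments(x, g, delta, n0, burst=(3, 4))
    print("fixed threshold   :", error_probabilities(mean_b, var_b, 0.5 * y1))
    print("variable threshold:", error_probabilities(mean_b, var_b, 0.5 * mean_b))
```

With the fixed level the error of the second kind dominates once part of the signal is suppressed, while rescaling the threshold to the reduced mean brings both error kinds back into balance.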


Fig.  1.  Functional  diagram  of  a burst  suppressor. 

B.  Effect  of  Transmission  Interval  on  System  Performance 

In considering the effect of a variable transmission interval on system performance
in burst noise, two factors become immediately apparent. The length of the
transmission interval $T = \Delta K$ and the channel bandwidth are directly related through the
channel input-output state difference equation (see Ref. 10) so that for a given fixed
input signal energy, the transmitted energy is directly proportional to $T$. In addition,
for a given maximum burst width, the effective width is inversely proportional to the
length of the interval $T$.

Fig. 2. Functional block diagram of a receiver containing a multi-threshold-level
decision circuit.

As a numerical example, consider the performance of the systems of Refs. 10
and 11 as a function of $T$ for a fixed set of channel parameters and a given finite
signal energy. With $E_g = 10$, a = p = 1, and unit energy correlation functions, the two
systems are shown in Figures 3 and 4. The performance of these systems in terms of
$P_0$ and $P_{max}$, where these parameters are defined in Ref. 10, is plotted in Fig. 5 as
a function of $T$ for a maximum absolute burst width of 0.433. It is interesting to note
that the performance of either system can be greatly improved by increasing $T$ even by
a small factor. Generalizing the above results, it can be stated that increasing the
transmission interval improves system performance in burst noise regardless of the
signal waveshape.

Since a longer transmission interval corresponds to a lower data rate, a
compromise between the value of $T$ and the magnitude of the noise-to-signal ratio is
required. It should be noted that for a given $T$, the data rate can be increased by
employing other modes of transmission such as ternary, quaternary, or, in general,
M-ary. In the case of M-ary transmission over a bandlimited channel, the optimal
number of signals is related to the transmission interval, the noise-to-signal ratio,
and the complexity of the receiver. This problem is discussed in detail elsewhere in
this report. In the next section, an indication will be given about the influence of burst
noise on the performance of M-ary systems in Gaussian noise and intersymbol
interference.

Fig. 5. Effect of the length of the transmission interval on system performance in
burst noise.

C. The Influence of Burst Noise on the Performance of M-ary Systems in Gaussian
Noise and Intersymbol Interference

Consider two quaternary alphabets based on Hadamard matrix transformation
(see the presentation on M-ary detection in this report). The performance of systems
using these alphabets in additive Gaussian and burst noise in connection with a
nonlinear suppressor is illustrated in Figures 6 and 7. In particular, Fig. 6 shows the
effect of burst noise on the cross correlation between the various signals and detectors
defined by

$$C_{ij}(c, w, K) = \Delta \sum_{k=0}^{K} x_i(c, w, k)\,g_j(k) \tag{7}$$

Figure 7 shows the variation in the detector output noise-to-signal ratio as a function
of the burst center position $c$ for each of the signals of the two alphabets. The above
performance results are summarized in Table I for $N_0 = 1$ and the normalized burst
width $w = 0.467$. The correlation between the individual signals and detectors is also
given in the table.
TABLE I. Performance of Hadamard transformed quaternary orthogonal
systems in additive Gaussian and burst noise; $E_g = 10$, $N_0 = 1$,
$w = 0.467$.

a. System 1

    Signal    Gabor BW    P_0(K)    P_max(K)
    1         82.0        0.990     2.53
    2         80.8        0.985     2.46
    3         66.1        1.052     2.46
    4         66.2        1.068     2.31

    Correlation C_ij:  45   06   .0146   0176

b. System 2

    Signal    Gabor BW    P_0(K)    P_max(K)
    1         69.9        0.994     1.84
    2         75.5        1.021     1.37
    3         70.1        1.013     1.86
    4         75.8        0.990     1.30

    Correlation C_ij:  -.0066   -.0179   .6977   -.0172
ON ESTIMATION OF CHANNEL CONDITIONS IN DIGITAL COMMUNICATION SYSTEMS
L. Kurz

The field of estimation of channel conditions or channel monitoring entails the
estimation of the error rate in telemetry and data communication systems. The
approaches fall into two categories which are designated as direct (estimating the actual
error rate) and extrapolated (estimation of a parameter of a known distribution).
The effort here is concentrated on a direct method useful in an M-ary (M > 2) data
transmission or telemetry system. A nonparametric estimator of noise conditions,
based on the output from M matched filters of which at a given time usually one is
active (signal and noise) and M-1 are passive (just noise), is introduced. The proposed
estimator is based on intuitive reasoning and is shown to have good qualities in
estimating the true average probability of error. An expression for the mean of the
estimator is developed in terms of the signal and noise distribution, and it is shown to
be close to the true average probability of error.

A. Channel Description

For M-ary transmission the transmitter sends one of M signals (S_1, S_2, ..., S_M).
The receiver is composed of M matched filters, each of which is matched to one of the
signals. At the end of each signaling interval, the outputs of the M filters
are sampled. If the filter matched to S_i has the largest sampled output, the signal
classifier assumes that S_i had been transmitted. If S_i is transmitted with a priori
probability p_i, the sampled output of the k-th filter (k ≠ i) is equal to n_k (just noise),
while the sampled output of the i-th filter is equal to S_i + n_i. It is assumed that the
outputs of all the idle and active filters are independent, the p.d.f.'s of all the idle
filters are the same, and the p.d.f. of the active filter is the same independently of
which signal was sent. Hence,

    f_{n_j n_k}(x, y) = f_{n_j}(x) f_{n_k}(y)

where

    f_{n_j}(·) = p.d.f. of the j-th idle filter,

    f_{n_k}(·) = p.d.f. of the k-th idle filter,

and

    f_{n_j n_k}(·,·) = joint p.d.f. of the j-th and k-th idle filters,

and

    f_{sig_i}(x) = f_{sig_j}(x)    for all i and j

where

    f_{sig_i}(x) = f_{sig}(x) = p.d.f. of S_i + n_i, for all i.

Hereafter attention will be given to the combined output S_i + n_i rather than the
individual outputs S_i and n_i. The only necessary assumption is that the output S_i + n_i be
independent of the output n_k (k ≠ i). Therefore, the assumption that the noise output
in the active filter be independent of the signal output in the active filter is unnecessary,
and the solution to the monitoring problem may be extended to nonlinear processing by
the receiver.

B. Expression for the Probability of Error

The probability of error is equal to the probability that one of M - 1 independent
noise samples will be greater than the output of signal plus noise. (One should note
that since the probability of error is the same, independently of which signal was sent,
the probability of error will be independent of p_i.) The probability of error is equal to
the probability that the signal will be equal to x, multiplied by the probability that the
maximum of the noise samples is greater than x, summed over all possible values of
x, or

    P_e = \int_{-\infty}^{\infty} f_{sig}(x) \int_{x}^{\infty} f_{max}(y) dy dx                    (1)

where

    f_{max}(y) = p.d.f. of the maximum of M - 1 independent noise samples.

It should be noted that for M = 2, Eq. (1) reduces to

    P_e = \int_{-\infty}^{\infty} f_{sig}(x) [ \int_{x}^{\infty} f_n(y) dy ] dx

        = 1 - \int_{-\infty}^{\infty} f_{sig}(x) F_n(x) dx                                         (2)
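
As a numerical illustration of Eqs. (1) and (2), the sketch below evaluates P_e = 1 - \int f_{sig}(x) F_n^{M-1}(x) dx by the midpoint rule. The Gaussian signal and noise model, the function names and the integration limits are assumptions made for this example only; they are not taken from the report.

```python
import math


def gaussian_pdf(x, mean=0.0, sigma=1.0):
    """Density of a Gaussian variable with the given mean and standard deviation."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def gaussian_cdf(x, mean=0.0, sigma=1.0):
    """Distribution function of the same Gaussian variable."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def error_probability(M, signal_mean, sigma=1.0, steps=20000):
    """P_e = 1 - integral of f_sig(x) * F_n(x)**(M-1), midpoint rule.

    Assumed model (hypothetical): idle outputs are N(0, sigma^2) and the
    active output is N(signal_mean, sigma^2).
    """
    lo = min(0.0, signal_mean) - 10.0 * sigma
    hi = max(0.0, signal_mean) + 10.0 * sigma
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        f_sig = gaussian_pdf(x, signal_mean, sigma)
        f_max_cdf = gaussian_cdf(x, 0.0, sigma) ** (M - 1)  # F_max = F_n^(M-1)
        total += f_sig * f_max_cdf
    return 1.0 - total * h
```

For M = 2 and unit-variance noise this should agree with the closed form 0.5*erfc(d/2) for a signal mean d, which gives a simple check on the integration.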

Since the probability of error is small, the maximum value of the sampled outputs of
the idle filters nearly represents the maximum value of the M - 1 independent noise
samples, and the sampled outputs of the active filter nearly represent samples of
signal plus noise. The samples are then taken from the maximum of the idle filters
and from the active filter, and a linear statistic is calculated using the values of the
two outputs.


The statistic used in estimating the average probability of error is a modified
version of the Mann-Whitney test. Let x_1, x_2, ..., x_k represent k sampled outputs of
the maximum of the idle filters, and y_1, y_2, ..., y_k represent the k corresponding outputs
from the active filter. The statistic used is

    \hat{P}_e = \frac{W' + k(k-1)}{2k(k-1)}                                  (3)

where

    W' = \sum_{i=1}^{k} \sum_{j=1, j \ne i}^{k} sgn(x_i - y_j)

and  the  standard  Mann-Whitney  test  is 


(4) 


The choice of the estimator given by Eq. (3) is not arbitrary but follows from logical
reasoning. For the ordinary Mann-Whitney test, W, it is well known that the mean is

    E[W] = P^+ - P^-

where

    P^+ = Prob[x > y]  and  P^- = Prob[x < y]                                (5)

Since x and y have continuous c.d.f.'s, P(x = y) = 0 and P^+ + P^- = 1. P^+ can be
expressed in terms of E[W] as

    E[W] = [P^+ - (1 - P^+)] = [2P^+ - 1]                                    (6)

and

    P^+ = \frac{1 + E[W]}{2}                                                 (7)

Since P^+ is the quantity which is sought, the measured value of W is substituted for its
mean to obtain an estimator of the average probability of error.
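
The estimator of Eq. (3) can be sketched in a few lines. The function `error_rate_estimate` follows Eq. (3) with the i = j pairs excluded from W'. The helper `simulate_outputs` is purely hypothetical: it assumes unit-variance Gaussian filter outputs and takes the largest output in each interval as the "active" sample and the largest of the remaining outputs as the "idle" sample, which is one plausible receiver-side reading of the text.

```python
import random


def sgn(v):
    """Sign function with sgn(0) = 0."""
    return (v > 0) - (v < 0)


def error_rate_estimate(x, y):
    """Eq. (3): (W' + k(k-1)) / (2k(k-1)), with W' summed over pairs i != j."""
    k = len(x)
    if k != len(y) or k < 2:
        raise ValueError("x and y must have the same length k >= 2")
    w_prime = sum(sgn(x[i] - y[j]) for i in range(k) for j in range(k) if i != j)
    pairs = k * (k - 1)
    return (w_prime + pairs) / (2.0 * pairs)


def simulate_outputs(k, M, signal_mean, rng):
    """Hypothetical channel: M >= 2 filters, unit-variance Gaussian noise.

    Returns (x, y): x holds the largest of the declared-idle outputs and
    y the declared-active (largest) output for each of k intervals.
    """
    x, y = [], []
    for _ in range(k):
        outputs = [rng.gauss(0.0, 1.0) for _ in range(M)]
        outputs[0] += signal_mean  # filter 0 carries the signal
        ranked = sorted(outputs, reverse=True)
        y.append(ranked[0])
        x.append(ranked[1])
    return x, y
```

A typical call would be `x, y = simulate_outputs(200, 4, 3.0, random.Random(1))` followed by `error_rate_estimate(x, y)`. The double sum costs k(k-1) comparisons, so for large k a sort-based rank count would be the natural replacement.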

Since the estimate \hat{P}_e is of the form aW' + b and W' is asymptotically normal,
it follows that \hat{P}_e is asymptotically normal. The random variable \hat{P}_e is,
therefore, completely specified by its mean and variance, with the mean satisfying

(8)


m 


where 


*"idle=  ^ 


(9) 


Since x_i is a sample of the idle filter and y_j is a sample of the active filter, it is
necessary to find the p.d.f. of each of the above two filters. Denoting the p.d.f.'s of the
two filters by f_{idle}(x) and f_{act}(x), one obtains

    f_{idle}(x) = p.d.f. of min {a sample of f_{sig}(x), a sample of f_{max}(x)}

and

    f_{act}(x) = p.d.f. of max {a sample of f_{sig}(x), a sample of f_{max}(x)}


It is easy to show that

    F_{max}(x) = F_n^{M-1}(x)                                                (10)

and

    F_{act}(x) = F_{sig}(x) F_n^{M-1}(x)                                     (11)

Similarly,

    F_{idle}(x) = F_{sig}(x) + F_n^{M-1}(x) - F_{sig}(x) F_n^{M-1}(x)

and

    f_{act}(x) = f_{sig}(x) F_n^{M-1}(x) + F_{sig}(x) f_{max}(x)             (12)

Since

    P_{idle} = \int_{-\infty}^{\infty} f_{act}(x) [1 - F_{idle}(x)] dx
             = 1 - \int_{-\infty}^{\infty} f_{act}(x) F_{idle}(x) dx         (13)

combining Eqs. (12) and (13) yields

    P_{idle} = 1 - \int_{-\infty}^{\infty} [ f_{sig}(x) F_n^{M-1}(x) + F_{sig}(x) f_{max}(x) ]
                   [ F_{sig}(x) + F_n^{M-1}(x) - F_{sig}(x) F_n^{M-1}(x) ] dx



Figures 1-3 represent typical performance curves for the indicated signal and noise
conditions. From the curves it follows that the estimator suggested here compares
favorably with the theoretical and counted error rate for three different classes of
channels.

Fig. 1. Estimators of probability of error as a function of equivalent signal-noise
ratio parameter.

Fig. 2. Estimators of probability of error as a function of equivalent signal-noise
ratio parameter.

Fig. 3. Estimators of probability of error as a function of equivalent signal-noise
ratio parameter.

Joint Services Technical Advisory Committee

F44620-74-C-0056  L. Kurz
A ROBUST VECTOR ROBBINS-MONRO PROCEDURE WITH APPLICATION TO
PARAMETER ESTIMATION IN m-INTERVAL DETECTORS

P. Kersten and L. Kurz

In a recent report (Ref. 7) the use of the Robbins-Monro (R-M) and the Kiefer-Wolfowitz
(K-W) procedures to create robust recursive estimators was illustrated. In both of
these scalar estimation procedures, the rate of convergence depends on the
choice of the gain coefficient, which in part controls the step size of this iterative
corrective procedure. To enhance the rate of convergence, adaptive gain coefficients
which closely estimate the optimum gain coefficients are inserted. In addition, by
using nonlinear regression functions, these recursive estimators can be made robust.
Both of these concepts may be extended to produce similar results in the vector R-M
procedure.

As with the scalar version of the R-M procedure, several authors (Dvoretzky, Derman
and Sacks; Refs. 4, 5, 10) have considered one or more forms of convergence of the
vector R-M procedure. Under suitable conditions the vector R-M procedure converges
in mean-square, wpl, and in law. In Section A, a theorem which establishes the
asymptotic normality of the vector R-M procedure with adaptive gain matrices is stated.
This is the vector analog of Theorem 1 of Reference 7. In Section B, an application of
this procedure to parameter estimation in m-interval detectors is presented. It is shown
how adaptive gain matrices are used in the estimation of the quartiles of a distribution.
The resulting recursive estimator is not only computationally attractive, but also robust.


A. The Vector Robbins-Monro Procedure

The usual version of the vector Robbins-Monro procedure (hereafter referred to
as the R-M procedure) is given by the recursion equation

    X_{n+1} = X_n - a_n (Y(X_n) - \alpha)

where X_n, X_{n+1}, Y(X_n) and \alpha are q-dimensional vectors and {a_n} is a sequence of
positive scalar constants s.t. \sum a_n = \infty and \sum a_n^2 < \infty. If the stochastic approximation
(S.A.) procedure is visualized as q distinct (possibly dependent) scalar S.A. processes
instead of a single q-dimensional process, the scalar gain coefficient should be
replaced by a gain matrix. The advantage of this is to "match" the diagonal entries in the
gain matrix to the q distinct S.A. procedures. If the gain coefficient a_n is adaptive,
then a_n generalizes to a random gain matrix which we denote as A_n(·) = A_n(X_1, ..., X_n).
Accordingly, the conditions \sum a_n = \infty and \sum a_n^2 < \infty must now be replaced with similar
conditions holding wpl upon the eigenvalues of the gain matrix A_n(·). We adopt the
Euclidean norm as our metric and the corresponding matrix norm that it induces, i.e.,
the spectral norm (Ref. 9).
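
The scalar-gain recursion above can be sketched as follows. This is a minimal illustration under assumed conditions, not the procedure analyzed in the report: the gain is a_n = gain/n, the regression function in `demo` is linear with hypothetical slopes, and the observation noise is unit-variance Gaussian.

```python
import random


def robbins_monro(observe, alpha, x0, steps, gain=1.0):
    """Iterate X_{n+1} = X_n - a_n (Y(X_n) - alpha) with a_n = gain / n.

    observe(x) must return one noisy q-dimensional observation Y(x).
    """
    x = list(x0)
    for n in range(1, steps + 1):
        y = observe(x)
        a_n = gain / n
        x = [xi - a_n * (yi - ai) for xi, yi, ai in zip(x, y, alpha)]
    return x


def demo(seed=0, steps=20000):
    """Hypothetical example: M(X) = alpha + B (X - theta) with diagonal B."""
    rng = random.Random(seed)
    theta = [1.0, -2.0]   # root to be found (assumed for the demo)
    slopes = [1.0, 2.0]   # diagonal entries of B (assumed)
    alpha = [0.5, 0.5]

    def observe(x):
        return [a + b * (xi - t) + rng.gauss(0.0, 1.0)
                for a, b, xi, t in zip(alpha, slopes, x, theta)]

    return robbins_monro(observe, alpha, [0.0, 0.0], steps)
```

With a single scalar gain the two components converge at rates set by their individual slopes, which is the mismatch that a gain matrix with "matched" diagonal entries is meant to remove.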
The R-M procedure is defined by the recursion equation

    X_{n+1} = X_n - \frac{1}{n} A_n(X_1, ..., X_n) [Y(X_n; \theta) - \alpha]

where X_n, Y(X_n; \theta), \alpha and \theta are q-dimensional vectors, EY(X; \theta) = M(X) and
M(X) = \alpha has the unique solution X = \theta. The adaptive gain matrices are assumed
to be positive definite with eigenvalues a_1^{(n)}(·) \ge ... \ge a_q^{(n)}(·) > 0 s.t.
\sum_n a_q^{(n)}(·)/n = \infty wpl.

The assumptions follow closely those of Sacks (Ref. 10) in spirit and form.

The first three assumptions establish the relationship of the i.i.d. random
vectors Z_n(X) to the regression function, namely, the noise is additive and zero mean.

(1) E[Y(X_n; \theta) | X_n, ..., X_1] = E[Y(X_n; \theta) | X_n] = M(X_n; \theta),
    abbreviated M_n.

(2) Y(X_n; \theta) = M(X_n; \theta) + Z(X_n) = M(X_n) + Z(X_n) = M_n + Z_n, where
    EZ_n = EZ(X_n) = E(Z_n | X_n) = E(Z_n | X_n, ..., X_1) = 0 wpl.

(3) M(X; \theta) = \alpha has a unique root, i.e., X = \theta.

To show convergence, the class of regression functions must be restricted in the
same manner as the scalar case. The following three assumptions serve this purpose.

(4) (X - \theta)^T (M(X; \theta) - \alpha) > 0 for all X \ne \theta.

(5) K ||X - \theta|| \le ||M(X; \theta) - \alpha|| \le K_1 ||X - \theta|| for ||X - \theta|| \le \eta and
    K_0 \le ||M(X; \theta) - \alpha|| \le K_1 ||X - \theta|| for \eta < ||X - \theta|| < \infty.

(6) M(X; \theta) = \alpha + B(X - \theta) + \delta(X; \theta), where B is a positive definite q x q matrix
    s.t. ||B|| < \infty and ||\delta(X_n; \theta)|| = o(||X_n - \theta||) as X_n - \theta -> 0.

Further, there exists an orthogonal transformation P s.t. P^T P = I, P^T B P = diagonal
matrix and P^T A_n P = diagonal matrix for each n.

As in the scalar case, the additive noise vector Z(X) must have a uniformly
bounded variance and have a well defined covariance matrix as X -> \theta a.s. In addition,
the adaptive gain matrix needs to be well behaved and a consistent "mean square"
estimator of a constant matrix. These requirements are reflected in the following
assumptions:

(7) (a) \sup_X E||Z(X)||^{2+\epsilon} < \infty for some \epsilon > 0.

    (b) \lim_{X -> \theta} E[Z(X) Z^T(X)] = \pi, where \pi is a non-negative definite matrix and
        where the limit is in the sense of the norm.

(8) Let a_1^{(n)} \ge ... \ge a_q^{(n)} > 0 be the eigenvalues of A_n, b_1 \ge ... \ge b_q > 0
    be the eigenvalues of B, and a_1 \ge a_2 \ge ... \ge a_q > 0 be the eigenvalues of A.

    (a) 0 < a' \le \inf_X ||A_n|| \le \sup_X ||A_n|| \le a'' < \infty wpl for n large.

    (b) \lim_{n -> \infty} E||A_n - A||^2 = 0, where A is a constant matrix s.t. a' \le ||A|| \le a''.

(9) a' b_q - \epsilon > 1/2 and a_q b_q - \epsilon > 1/2.

Assumption (9) has no readily apparent physical interpretation other than that it is
required for convergence within the proof.

Theorem 1: Under assumptions (1) - (9), let r_1 \ge r_2 \ge ... \ge r_q > 0 be the
eigenvalues of AB. Then n^{1/2}(X_n - \theta) is asymptotically normal with mean zero and
covariance matrix P Q P^T, where Q is a matrix whose (i,j)th element is
a_i \pi^*_{ij} a_j [r_i + r_j - 1]^{-1} and where \pi^* = P^{-1} \pi P.

Proof: See appendix of Reference 1.

B. The Simultaneous Estimation of the Quartiles of a Distribution Function

In this example, the R-M procedure is used for the simultaneous estimation of
the quartiles of a symmetric distribution given random samples from the distribution.
It will become clear that this procedure generalizes to the simultaneous estimation of
n arbitrary quantiles. The use of adaptive gain matrices in conjunction with the R-M
procedure ensures that this recursive estimator converges at a "near" optimum rate.
For real time estimation of quantiles this becomes an important consideration. For
instance, the nonparametric m-interval partition detector considered by Ching and
Kurz requires the estimation of a finite number of quantiles of the noise distribution
for its implementation. To enable this particularly attractive detector to adapt to a
slowly-varying noise distribution, training samples can be taken periodically to update
the estimates of these quantiles. Since the R-M procedure is recursive, it requires
little sample storage and is amenable to digital computation. In fact, it is ideally
suited for this task. Clearly, the adaptiveness of the detector will be dependent in
part upon the rate of convergence of the quantile estimation procedure. In communication
channels where the noise distribution is a linear combination of a low variance
noise distribution and a burst noise component, it is also necessary that this estimation
procedure be robust. In this example, the robustness will be guaranteed by the
fact that our regression function is bounded and nonlinear, and our adaptive gain
matrices are based upon linear order statistics. It is this very problem which
motivates the generality of Theorem 1.

Before proceeding to the three-dimensional case, the one-dimensional case will
be examined to provide an easier setting in which to discuss the properties of the
linear order statistics used in the adaptive gain matrices. In the scalar case, the
recursive equation becomes

    X_{p+1} = X_p - \frac{1}{p} A_p S_m(X_p, \lambda_k)

where

    S_m(X_p, \lambda_k) = \frac{1}{m} \sum_{i=1}^{m} sgn[X_p - Y_{i+m(p-1)}] + (1 - 2\lambda_k)

and

    A_p = \frac{m+1}{2(p-1)} \sum_{j=1}^{p-1} ( Z_{j,[m\lambda_k]+1} - Z_{j,[m\lambda_k]} )

where Z_{j,1} \le ... \le Z_{j,m} are the order statistics from the random samples
Y_{i+m(j-1)}, i = 1, ..., m, with CDF F(x), and [m\lambda_k] is defined to be the greatest
integer less than or equal to m\lambda_k.

The estimator A_p is essentially the same one used in Ref. 6 and is a biased
estimator of the optimum gain coefficient 1/2f(\theta_k), where \theta_k is the \lambda_k quantile. Since
A_p is an average of i.i.d. random variables, if E|Z_{j,[m\lambda_k]+1} - Z_{j,[m\lambda_k]}| < \infty,
then A_p converges a.s. to (m+1) E[Z_{j,[m\lambda_k]+1} - Z_{j,[m\lambda_k]}]/2 by the Strong Law of Large
Numbers. Thus to study the limit of A_p is to evaluate the expected value of A_p,
assuming it is finite. Unfortunately, the latter task is in general difficult and one must
resort to numerical methods of integration to obtain accurate answers. More
importantly, in the one-dimensional case, one must be careful that the a.s. limit A of A_p
is not equal to 1/(4f(\theta_k)), since it is well known that \sqrt{p}(X_p - \theta_k) is asymptotically
N(0, A^2 \sigma^2/(2A\alpha - 1)), where \alpha = 2f(\theta_k) and \sigma^2 = Var S_m. If A_p -> 1/(4f(\theta_k)) a.s.,
then the variance is infinite and the convergence is meaningless. This variance will be
minimum if A\alpha = 1 (Reference 8). If A_p is biased, it is best that it be greater than
1/(4f(\theta_k)) so that the rate of convergence is "close" to optimal. To do this a lower
bound is established on the estimator A_p.
For convenience, consider \delta_i = (n+1)(Z_{(i+1)} - Z_{(i)}), an estimator of 1/f(\theta_i). Its
expected value is given by

    E\delta_i = (n+1) E[Z_{(i+1)} - Z_{(i)}] = (n+1) \int_0^1 G(x) [P_{i+1}(x,n) - P_i(x,n)] dx

where

    P_i(x,n) = \frac{n!}{(i-1)!(n-i)!} x^{i-1} (1-x)^{n-i}

is the PDF of the i-th order statistic from a uniform random sample of size n and

    G(x) = \inf_y {y : F(y) \ge x} = F^{-1}(x),

F(x) being the CDF of the random samples. We now define a class of distribution
functions \mathcal{F}(i) as those CDF's which possess the following two properties.

(1) \lim F^{-1}(x) P_{i+1}(x,n+1) = 0 as x -> 0+ and x -> 1-, and

(2) G'(x) = 1/f(F^{-1}(x)) is a convex function for x \in (0,1).

The first property allows us to evaluate the expectation of \delta_i via integration by parts as

    E\delta_i = -F^{-1}(x) P_{i+1}(x,n+1) |_0^1 + \int_0^1 G'(x) P_{i+1}(x,n+1) dx
              = \int_0^1 \frac{1}{f(F^{-1}(x))} P_{i+1}(x,n+1) dx

The second property permits us to apply Jensen's Inequality to conclude that

    E\delta_i = E G'(X) \ge G'(EX) = G'(\frac{i+1}{n+2}) = 1/f(F^{-1}(\frac{i+1}{n+2}))

The resulting class of functions \mathcal{F}(i) contains the Gaussian, Cauchy and Laplace
distributions, and this lower bound ensures that (1/2)E\delta_i is strictly greater than 1/(4f(\theta_i))
for these distributions. Reference to Table I will confirm that this bound is
conservative and that the calculated values of E\delta_i/2 = (n+1) E[Z_{(i+1)} - Z_{(i)}]/2 exceed
1/2f(\theta_i) for the three aforementioned distributions. However, if the distribution is
unknown except for the fact that f(x) \le f(y) for |x| \ge |y| and F(x) \in \mathcal{F}(i-1), then one can
conclude that the estimator Z_{(i)} - Z_{(i-1)}, instead of Z_{(i+1)} - Z_{(i)}, will have a lower bound of
1/2f(F^{-1}(\frac{i}{n+2})) \ge 1/2f(F^{-1}(i/n)) = 1/2f(\theta_i) for \theta_i \le F^{-1}(1/2), which suffices for
symmetric density functions. This provides a safer estimate but in view of Table I is
unwarranted for this particular example.
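
The three quantities compared in Table I can be computed numerically from the expressions above. The sketch below is an illustration only: it assumes n = 16, takes \theta_i to be the i/n quantile, evaluates E\delta_i/2 as (1/2) \int_0^1 G'(u) P_{i+1}(u,n+1) du by the midpoint rule, and uses unit-scale Laplace, Gaussian and Cauchy densities. The function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def inv_density_laplace(u):
    """G'(u) = 1/f(F^{-1}(u)) for the density (1/2) exp(-|x|)."""
    return 1.0 / min(u, 1.0 - u)


def inv_density_gaussian(u):
    """G'(u) for the standard Gaussian density."""
    return 1.0 / _STD_NORMAL.pdf(_STD_NORMAL.inv_cdf(u))


def inv_density_cauchy(u):
    """G'(u) for the standard Cauchy density; f(F^{-1}(u)) = sin^2(pi u)/pi."""
    return math.pi / math.sin(math.pi * u) ** 2


def half_expected_spacing(inv_density, i, n, steps=20000):
    """(1/2) E[delta_i] = (1/2) * integral of G'(u) P_{i+1}(u, n+1) over (0, 1)."""
    coeff = (n + 1) * math.comb(n, i)  # normalizing constant of P_{i+1}(u, n+1)
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        total += inv_density(u) * coeff * u ** i * (1.0 - u) ** (n - i)
    return 0.5 * total * h


def optimum_gain(inv_density, i, n):
    """1/(2 f(theta_i)) with theta_i the i/n quantile."""
    return 0.5 * inv_density(i / n)


def jensen_lower_bound(inv_density, i, n):
    """(1/2) G'((i+1)/(n+2)), the Jensen bound on (1/2) E[delta_i]."""
    return 0.5 * inv_density((i + 1) / (n + 2))
```

For example, `half_expected_spacing(inv_density_laplace, 4, 16)` can be compared with `optimum_gain(inv_density_laplace, 4, 16)` and `jensen_lower_bound(inv_density_laplace, 4, 16)`; the ordering bound <= optimum gain <= E\delta_i/2 is the property the text relies on.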
TABLE I

                 First Quartile (i = 4)                   Second Quartile (i = 8)                  Third Quartile (i = 12)
Distribution     E\delta_i/2  1/2f(\theta_i)  Lower Bound     E\delta_i/2  1/2f(\theta_i)  Lower Bound     E\delta_i/2  1/2f(\theta_i)  Lower Bound

Laplace          2.129732   2.0         1.8             1.270854   1.0         1.0             2.129723   2.0         1.8
Gaussian         1.642492   1.573113    1.49            1.312659   1.253314    1.253314        1.642452   1.573113    1.49
Cauchy           4.247329   3.141593    2.68            1.846653   1.570796    1.570796        4.247329   3.141593    2.68


Having established the properties of A_p for symmetric distributions, the
extension to the 3-dimensional case is straightforward. Toward this end, define the gain
matrix by

    A_p = \frac{m+1}{2(p-1)} diag( \sum_{j=1}^{p-1} (Z_{j,[m/4]+1} - Z_{j,[m/4]}),
                                   \sum_{j=1}^{p-1} (Z_{j,[m/2]+1} - Z_{j,[m/2]}),
                                   \sum_{j=1}^{p-1} (Z_{j,[3m/4]+1} - Z_{j,[3m/4]}) )

and the vector version of S_m by

    S_m(X_p, \lambda) = [ \frac{1}{m} \sum_{i=1}^{m} sgn[X_{p_1} - Y_{i+m(p-1)}] + (1 - 2\lambda_1),
                         \frac{1}{m} \sum_{i=1}^{m} sgn[X_{p_2} - Y_{i+m(p-1)}] + (1 - 2\lambda_2),
                         \frac{1}{m} \sum_{i=1}^{m} sgn[X_{p_3} - Y_{i+m(p-1)}] + (1 - 2\lambda_3) ]^T

The regression function M(X_p, \theta) = E S_m(X_p, \lambda) is now given by
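    M(X_p, \theta) = 2 [ F(X_{p_1}) - \lambda_1, F(X_{p_2}) - \lambda_2, F(X_{p_3}) - \lambda_3 ]^T

                 = 2 diag( f(\theta_1), f(\theta_2), f(\theta_3) ) [ X_{p_1} - \theta_1, X_{p_2} - \theta_2, X_{p_3} - \theta_3 ]^T

                   + diag( f'(\theta_1), f'(\theta_2), f'(\theta_3) ) [ (X_{p_1} - \theta_1)^2, (X_{p_2} - \theta_2)^2, (X_{p_3} - \theta_3)^2 ]^T

                   + H.O.T.'s

The B matrix is given by

    B = 2 diag( f(\theta_1), f(\theta_2), f(\theta_3) )

and the additive noise vector Z(X_p, \lambda) is given by

    Z_p = [ \frac{1}{m} \sum_{i=1}^{m} sgn(X_{p_1} - Y_{i+m(p-1)}) + 1 - 2F(X_{p_1}),
            \frac{1}{m} \sum_{i=1}^{m} sgn(X_{p_2} - Y_{i+m(p-1)}) + 1 - 2F(X_{p_2}),
            \frac{1}{m} \sum_{i=1}^{m} sgn(X_{p_3} - Y_{i+m(p-1)}) + 1 - 2F(X_{p_3}) ]^T

By the result established above, we are sure that A_{ii} > 1/(4f(\theta_i)) and thus
[AB]_{ij} = r_i \delta_{ij}, where r_i > 1/2.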

With the above notation, conditions (1) and (2) are transparent and condition (3)
is satisfied if F(x) is strictly increasing and continuous, which we simply assume. If
\lambda_i = i/4, i = 1,2,3 with corresponding quartiles \theta_i, then

    (X - \theta)^T (M(X; \theta) - \alpha) = 2 \sum_{i=1}^{3} (X_i - \theta_i)(F(X_i) - \lambda_i) > 0  if X \ne \theta

since X_i > \theta_i implies F(X_i) > \lambda_i and X_i < \theta_i implies F(X_i) < \lambda_i by our assumption that
F(x) be strictly increasing. To satisfy conditions (5) and (6), we assume f(x) is
continuous a.e. and its support is convex. Moreover, it is assumed that for all x, f(x) is
bounded, f(\theta_i) > 0, and 0 < |f'(\theta_i)| < \infty, i = 1,2,3. The last assumption along with the
expansion of M(X_p, \theta) guarantees condition (6). Condition (5) is also satisfied since

    ||M(x) - \alpha||^2 = 4 \sum_{i=1}^{3} (F(x_i) - \lambda_i)^2

and the assumptions on F(x) mean geometrically that each F(x_i) - \lambda_i is bounded above
and below as in Figure 1. Since B is diagonal, it follows that the matrix P specified
after assumption (6) is taken as the identity matrix.


The sgn function which appears in each component of Z_p assures that E||Z(x)||^{2+\epsilon}
is bounded uniformly in x, which in turn implies condition (7a). The covariance matrix
as X -> \theta is calculated to be

    [\pi]_{ij} = \frac{4}{m} \min(\lambda_i, \lambda_j) [1 - \max(\lambda_i, \lambda_j)]

and \pi is positive definite, thus verifying condition (7b). The form of [A_p]_{ii} allows one
to state that Z_{j,[m\lambda_i]+1} - Z_{j,[m\lambda_i]} > 0 wpl since F is strictly increasing and
continuous, and therefore

    \lim_{p -> \infty} [A_p]_{ii} > 0 wpl for i = 1,2,3

which yields condition (8b). The above argument establishes the first half of condition
(8a), and the fact that E[A_p]_{ii} is finite for all i suffices to establish that \sup ||A_p|| < \infty wpl.
Conditions for the existence of both the expectation and the variance of the order
statistics are given in Ref. 2, page 44. For our particular example, we shall assume
n = 16, i = 4, and thus even if F(x) is Cauchy, the expectation and variance of \delta_i are
finite. Condition (8) requires the existence of a diagonal matrix A s.t.
\lim E||A_p - A||^2 = 0, or in other words

    \lim_{p -> \infty} E[ [A_p]_{ii} - A_{ii} ]^2 = 0,  i = 1,2,3

Since A_{ii} = E[A_p]_{ii}, it suffices to show Var [A_p]_{ii} -> 0 as p -> \infty for all i. But

    Var [A_p]_{ii} = O(1/p)

provided the variance of \delta_i is bounded. Extension of this procedure to n arbitrary
quantiles requires that m be significantly larger than n and that [\lambda_k m] be used instead
of [im/4].

Simulation of this procedure was carried out for the quartiles of the Laplace
distribution using m = 16, i = 4, 8, 12. The algorithm was started by using a single batch
sample of size 16 to estimate \theta_i, i = 1,2,3 and 1/2f(\theta_i). The results of this simulation
are plotted in Figures 2 and 3. After the 120th iteration, [A_p]_{ii}, i = 1,2,3 are greater
than the optimum values and close enough to ensure a near optimum convergence rate.
Further, observe that X_{p_2}, the estimate of the median, appears to be biased slightly
positive due to the random number generator used in the simulation.
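
A simulation of the kind described above can be sketched as follows. This is not the authors' program. It assumes a unit Laplace density, m = 16, a gain built from the running average of (m+1)(Z_{([m\lambda]+1)} - Z_{([m\lambda])})/2 over the batches already seen (including the starting batch), and a starting point taken from the order statistics of the first batch. All function names are hypothetical.

```python
import math
import random


def sgn(v):
    """Sign function with sgn(0) = 0."""
    return (v > 0) - (v < 0)


def laplace_sample(rng):
    """One draw from the density (1/2) exp(-|x|)."""
    magnitude = rng.expovariate(1.0)
    return magnitude if rng.random() < 0.5 else -magnitude


def estimate_quantiles(draw, lambdas, m, iterations, rng):
    """Recursive quantile estimates X_{p+1} = X_p - (1/p) A_p S_m(X_p, lambda)."""
    ranks = [int(math.floor(m * lam)) for lam in lambdas]  # [m * lambda_k]
    if any(r < 1 or r > m - 1 for r in ranks):
        raise ValueError("each [m*lambda] must lie between 1 and m-1")
    batch = sorted(draw(rng) for _ in range(m))
    x = [batch[r - 1] for r in ranks]  # start at the order statistic Z_([m*lambda])
    spacing_sum = [(m + 1) * (batch[r] - batch[r - 1]) for r in ranks]
    batches_seen = 1
    for p in range(1, iterations + 1):
        gains = [0.5 * s / batches_seen for s in spacing_sum]
        y = [draw(rng) for _ in range(m)]
        for k, lam in enumerate(lambdas):
            s_m = sum(sgn(x[k] - yi) for yi in y) / m + (1.0 - 2.0 * lam)
            x[k] -= gains[k] * s_m / p
        z = sorted(y)
        for k, r in enumerate(ranks):
            spacing_sum[k] += (m + 1) * (z[r] - z[r - 1])
        batches_seen += 1
    gains = [0.5 * s / batches_seen for s in spacing_sum]
    return x, gains
```

For the unit Laplace density the quartiles are -ln 2, 0 and ln 2, so a call such as `estimate_quantiles(laplace_sample, [0.25, 0.5, 0.75], 16, 5000, random.Random(0))` can be checked against those values, and the returned gains can be compared with the optimum values 1/2f(\theta_i).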
M-LEVEL QUANTILE DETECTORS
B. Gotz and L. Kurz

The basic advantages of partition detectors based on quantile statistics are well
established (Refs. 1-7). Some difficulties in simple implementation and operation of this
type of detector in an adaptive mode were recently resolved for regular and mixture
distribution noise environments (Refs. 8, 9). The intent of this research is to extend the
theory of partition detectors to M-level systems. Two types of M-ary detectors are
considered: a generalized-sign equiprobable partitioning detector (GSEPD) and a
generalized-sign maximum-mean partitioning detector (GSMMPD). In the binary case,
the detector performance is evaluated on the basis of the asymptotic relative efficiency
(ARE) in the Pitman-Noether sense. For the evaluation of M-level detectors, by contrast,
difference statistics with the associated efficacies, cross-efficacies and correlation
coefficients are introduced.

A. Two Generalized Sign Detectors

Consider a simple generalization of the sign detector based on the test statistic

    T = \frac{1}{L} \sum_{i=1}^{m} i n_i = \frac{1}{L} \sum_{j=1}^{L} \sum_{i=1}^{m} u(x_j - a_{i-1})          (1)

where {x_j} are L i.i.d. samples of observed data, {a_i} are m quantiles of the
distribution under the hypothesis (usually if noise only is present), n_i are the occupancy
indicators of the number of samples between two adjacent quantiles, so that
\sum_{i=1}^{m} n_i = L, and u(·) is the unit step function. It is easy to show that for the
detector of Eq. (1), the efficacy for shift-of-the-mean alternatives reduces to

    E_{S.M.} = \frac{ { \sum_{i=1}^{m} i [f(a_{i-1}) - f(a_i)] }^2 }
                    { \sum_{i=1}^{m} i^2 [F(a_i) - F(a_{i-1})] - [ \sum_{i=1}^{m} i [F(a_i) - F(a_{i-1})] ]^2 }          (2)

where F(·) is the distribution under the hypothesis, f(·) is the corresponding probability
density function (p.d.f.) and the quantiles {a_i} are as yet unspecified. Two choices of
quantiles result in two useful detectors:

(1) The choice of partitions is equiprobable, or F(a_i) - F(a_{i-1}) = 1/m, yielding
a generalized-sign equiprobable partitioning detector (GSEPD).

(2) The choice of the quantiles is such that the mean of the generalized sign
statistic is maximized, or F(a_i) - F(a_{i-1}) = \frac{2i}{m(m+1)} and F(a_i) = \frac{i(i+1)}{m(m+1)},
yielding a generalized-sign maximum-mean partitioning detector (GSMMPD).

For GSEPD, Eq. (2) reduces to

    E_{S.M.} = \frac{12}{m^2 - 1} { \sum_{i=1}^{m} i [f(a_{i-1}) - f(a_i)] }^2          (3)

If m = 2 and f(a_0) = f(a_2) = 0, GSEPD becomes a sign detector and E_{S.M.} = 4f^2(a_1),
a well known result.

For the GSMMPD, Eq. (2) reduces to

    E_{S.M.} = \frac{18}{m^2 + m - 2} { \sum_{i=1}^{m} i [f(a_{i-1}) - f(a_i)] }^2      (4)



Numerical calculations have been performed for both detectors under Gaussian
noise conditions. The AREs of both detectors with respect to the optimum detector are over
90% for m = 5, with GSMMPD having a slight edge over the GSEPD. The ARE in
non-Gaussian noise with respect to the locally most powerful detector for Gaussian noise
frequently exceeds unity. As compared to rank detectors, both detectors have slightly
reduced efficiency in Gaussian noise but improved robustness if noise distributions vary.
In addition, the two detectors are simple to implement and do not require expensive
ranking of the observables.
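
The two partitioning rules and the statistic of Eq. (1) can be sketched as below. The code computes the cumulative levels F(a_i) for each rule, the variance of the cell index under the hypothesis (which is where the constants (m^2 - 1)/12 and (m^2 + m - 2)/18 behind Eqs. (3) and (4) come from), and the statistic T for a set of samples. The Gaussian default used to turn levels into thresholds and all function names are assumptions made for illustration.

```python
from fractions import Fraction
from statistics import NormalDist


def gsepd_levels(m):
    """F(a_i) = i/m, i = 1..m-1 (equiprobable cells)."""
    return [Fraction(i, m) for i in range(1, m)]


def gsmmpd_levels(m):
    """F(a_i) = i(i+1)/(m(m+1)), i = 1..m-1 (cell probability proportional to i)."""
    return [Fraction(i * (i + 1), m * (m + 1)) for i in range(1, m)]


def null_variance(levels):
    """sum i^2 p_i - (sum i p_i)^2 for the cell probabilities implied by levels."""
    cum = [Fraction(0)] + list(levels) + [Fraction(1)]
    probs = [cum[i] - cum[i - 1] for i in range(1, len(cum))]
    mean = sum(i * p for i, p in enumerate(probs, start=1))
    second = sum(i * i * p for i, p in enumerate(probs, start=1))
    return second - mean * mean


def thresholds(levels, dist=NormalDist()):
    """Cut points a_i for an assumed noise distribution (standard Gaussian by default)."""
    return [dist.inv_cdf(float(u)) for u in levels]


def generalized_sign_statistic(samples, cut_points):
    """T = (1/L) sum_i i*n_i, where a sample in (a_{i-1}, a_i] scores i."""
    total = 0
    for x in samples:
        total += 1 + sum(1 for a in cut_points if x > a)
    return total / len(samples)
```

For m = 5 the null variances come out as 2 and 14/9 respectively. `generalized_sign_statistic(samples, thresholds(gsepd_levels(5)))` then gives the GSEPD statistic for Gaussian noise without any ranking of the observables.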


B. Generalized Sign M-ary Detection

Unlike in the binary detection problem, a single partitioning based on the noise
distribution is replaced by M partitionings, one "matched" to each hypothesis. For
each hypothesis H_i, i = 1, ..., M, a partitioning vector a_i and a statistic T_i must be
selected. In general, for each hypothesis the test statistic is of the form

    T_i = \frac{1}{L} \sum_{k=1}^{m} b_k n_{i,k}                                         (5)

where the weight vector b of positive components is specified. An intuitively appealing
choice of the partitioning vectors a_i is such that the cell probabilities

    p_{i,j} = F_i(a_{i,j}) - F_i(a_{i,j-1})                                              (6)

are proportional to the weights b_j. Since \sum_{j=1}^{m} p_{i,j} = 1, the optimum a_i satisfy

    F_i(a_{i,j}) - F_i(a_{i,j-1}) = \frac{b_j}{\sum_{k=1}^{m} b_k},   j = 1, ..., m      (7)
Since F_i(a_{i,0}) = 0, by iteration,

    F_i(a_{i,j}) = \frac{\sum_{k=1}^{j} b_k}{\sum_{k=1}^{m} b_k}                         (8)

are obtained.

Specifically, T_i, i = 1, ..., M may be selected as the GSMMP statistics considered
in the previous subsection. In this case

    T_i = \frac{1}{L} \sum_{k=1}^{m} k n_{i,k}                                           (9)

where the quantile vectors are given by the solution of

    F_i(a_{i,k}) = \frac{k(k+1)}{m(m+1)},   i = 1, ..., M and k = 1, ..., m              (10)



When the design of a detector is based on one hypothesis (e.g., the null hypothesis), as
is the case in binary detection, the efficacy does not depend on the actual hypothesis
separation but only on the local rate of separation. This is not a meaningful measure
of performance when M "matched" detectors are used in an M-ary system. If we
assume stochastically ordered hypotheses as representing the M-ary system, the major
contribution to the error results from a decision on a hypothesis adjacent to the true
hypothesis. Thus, we need a measure of performance of two adjacent statistics, T_i and T_j.
We replace the relative comparison by an equivalent threshold comparison based on the
difference statistic

    T = T_i - T_j                                                                        (11)

The  efficacy  associated  with  the  difference  statistic  is  then  a meaningful  measure  of 
performance  of  a "matched"  type  M-ary  detector.  It  can  be  easily  shown  that  the 
efficacy  of  T is 


= lim 


2 ^ 2 - 
m.  + m . - 2 m.m . 
I J 1 J 


2m. m.  -,-1 


L-*oo  L(o.  +0.  - 2 po. o.)  L-*<» 

1 J 1 J 


.m.  r n x&J.  it].  til. 111. 


2m. m.  -,-1 


tm  ttJ.  til.  Cvtil.tiJ. 


Lor  La. 
J J 


p 111. 

-2‘>[r^ 


Lo.  o. 

1 J 


2m. m.  -,-1^-1 

} 

1 J 
where m_k' = \partial E{T_k}/\partial\theta, k = i or j, \theta is a signal parameter, and

    var T = \sigma_i^2 + \sigma_j^2 - 2\rho\sigma_i\sigma_j

with

    \sigma_k^2 = var T_k, k = i or j,   \rho = E{(T_i - E T_i)(T_j - E T_j)}/(\sigma_i \sigma_j)

Introducing self- and cross-efficacies

$$\mathcal{E}_{ii} = \lim_{L\to\infty} \frac{m_i^2}{L\sigma_i^2}, \qquad \mathcal{E}_{ij} = \lim_{L\to\infty} \frac{m_i^2}{L\sigma_j^2},$$


the  efficacy  of  the  difference  statistic  is 


>T  = C(^u 


- «..) 

ij  JJ 


- 2 p((?. . 5. . + S..S.. 

11  ij  ji  JJ 


11  JJ 


(13) 


The self- and cross-efficacies in Eq. (13) are easy to calculate. The calculation of $\rho$ in specific cases, though laborious as will be shown in the next section, does not create extreme difficulties. The difference efficacy of Eq. (13) may be used in the same fashion as transition probabilities in parametric detectors to evaluate the performance of M-ary partition detectors.
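As a rough numerical illustration of the difference statistic $T = T_i - T_j$, the sketch below evaluates the finite-$L$ ratio $(m_i - m_j)^2 / (L \operatorname{var} T)$ before any limit is taken. It assumes the usual definition of efficacy and that `m_i`, `m_j` are the mean sensitivities of the two statistics; the function name and argument names are hypothetical, not taken from the report.

```python
def difference_efficacy_ratio(m_i, m_j, sigma_i, sigma_j, rho, n_samples):
    """Finite-L efficacy ratio of the difference statistic T = T_i - T_j.

    Assumption: efficacy is the squared mean sensitivity divided by
    (number of samples) x (variance of T), with
    var T = sigma_i^2 + sigma_j^2 - 2*rho*sigma_i*sigma_j.
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must be a correlation coefficient in [-1, 1]")
    var_t = sigma_i ** 2 + sigma_j ** 2 - 2.0 * rho * sigma_i * sigma_j
    if var_t <= 0.0:
        raise ValueError("the difference statistic has zero variance")
    return (m_i - m_j) ** 2 / (n_samples * var_t)
```

Positive correlation between the two statistics reduces the variance of the difference and therefore raises the ratio, which is why $\rho$ has to be computed before the detectors can be compared.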


C. Efficacy of the Generalized Sign M-ary Detectors for the Class of Lehmann Alternatives

Before the efficacy of Eq. (13) can be evaluated for the class of Lehmann alternatives, the first and second moments of the statistic $T_i$ of Eq. (5) must be calculated. Consider the statistic


$$T' = LT = \sum_{i=1}^{m} b_i n_i \qquad (14)$$

It can be shown that the characteristic function of $T'$ is

$$\phi_{T'}(v) = \Big[ \sum_{i=1}^{m} p_i\, e^{\,j b_i v} \Big]^{L} \qquad (15)$$


where $p_i$ is defined in connection with Equation (6). Applying the change-of-scale theorem to Eq. (15) and using the method of characteristic functions, the mean and variance of $T$, given by Eq. (14), are

$$E[T] = \sum_{i=1}^{m} p_i b_i \qquad (16)$$

and

$$\operatorname{var} T = \frac{1}{L} \Big[ \sum_{i=1}^{m} p_i b_i^2 - \Big( \sum_{i=1}^{m} p_i b_i \Big)^2 \Big] \qquad (17)$$


Let $T_i = T_1$ and $T_j = T_2$ in the difference statistic of Eq. (11). For the class of Lehmann alternatives, the hypotheses $H_i$, $i = 1$ or 2, are

$$H_i:\quad F_1(x) = F(x) \quad \text{and} \quad F_2(x) = F^{1+\theta}(x) = e^{(1+\theta)\ln F(x)}$$

where $\theta$ is the nonzero parameter representing the signal. For small $\theta$ and $b_j = j$, it follows from Eq. (16) that


m 


.).F(..,j.,)l.F(a.,..,W  1=1  or  2 
For $i = 1$, Eq. (18), after substitution from Eq. (10), reduces to


m.  = E[T.]  = ■ ''"i.j-l'*“"'“i.j-l' 


(18) 


which with the notation $c_j = F(a_{1,j})$ becomes

$$m_1 = \sum_{j=1}^{m} j\,\big( c_j \ln c_j - c_{j-1} \ln c_{j-1} \big)$$

For $i = 2$, $F^{1+\theta}(a_{2,j}) = c_j$ or $F(a_{2,j}) = c_j^{1/(1+\theta)}$. Thus

1111 

V T T+0,  1+0 

”’2=Xjjh  '^j-i^""j  -1 

Similarly,  the  variances  are 

$$L\sigma_1^2 = \sum_{j=1}^{m} j^2\,\frac{2j}{m(m+1)} - \Big[ \sum_{j=1}^{m} j\,\frac{2j}{m(m+1)} \Big]^2 = \frac{m^2 + m - 2}{18}$$
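The closed form $(m^2 + m - 2)/18$ can be checked with exact rational arithmetic. The sketch below assumes integer scores $j = 1, \dots, m$ and cell probabilities $2j/(m(m+1))$, which is how the garbled expression above has been read; the helper names are hypothetical.

```python
from fractions import Fraction


def gsmmp_cell_probs(m):
    """Cell probabilities 2j / (m(m+1)), j = 1..m (assumed reading of the text)."""
    return [Fraction(2 * j, m * (m + 1)) for j in range(1, m + 1)]


def score_moments(m):
    """Mean and variance of the integer score j under the probabilities above."""
    probs = gsmmp_cell_probs(m)
    mean = sum(j * p for j, p in enumerate(probs, start=1))
    second = sum(j * j * p for j, p in enumerate(probs, start=1))
    return mean, second - mean * mean
```

For every $m$ tried, the variance agrees with $(m^2 + m - 2)/18$ and the mean equals $(2m + 1)/3$.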

The efficacies and cross-efficacies are then

$$\mathcal{E}_{ij} = \lim_{L\to\infty} \frac{m_i^2}{L\sigma_j^2}, \qquad i = 1, 2;\ \ j = 1, 2$$

and 


1*.  ..t.  , s 
It remains to calculate the correlation coefficient

$$\rho = \frac{E[T_1 T_2] - E[T_1]\,E[T_2]}{\sigma_1 \sigma_2}$$

where all expectations are with respect to $H_1$, i.e., the reference distribution $F(x)$.
Since  the  variances  have  been  calculated  and  the  means  are 


m ^ m - , 

ECxy  Hj]  = Xj  •>Pl,  j " m(m+l)  ^ 


and 


It remains to calculate $E[T_1 T_2 / H_1]$. The cross-expectation is


m m 


eCTjTj/h,]  = -^X, 

which  after  some  mathematical  manipulations  reduces  to 

m m 


:[TjT2/h^]  = E[Tj/Hj]  E[T2/H2]  + ^ X ij  Prob [x e Aj  . fl  A2^ 


where

$$A_{k,\ell} = (a_{k,\ell-1},\ a_{k,\ell})$$

To complete the evaluation of the correlation coefficient, an expression for $\mathrm{Prob}[x \in A_{1,i} \cap A_{2,j}]$ must be found. These probabilities depend on the signal parameter $\theta$ in a highly nonlinear fashion. The important fact is that as $\theta$ increases from zero, the $m$ partitioning intervals monotonically drift to the right (both ends of the interval drift to the right irrespective of what happens to the interval length). For small $\theta$, $A_{1,i} \cap A_{2,j} = \emptyset$ except for $j = i$, $i = 1, \dots, m$ and $j = i - 1$, $i = 2, \dots, m$. It follows that

$$\mathrm{Prob}[x \in A_{1,i} \cap A_{2,j}] = \begin{cases} F(a_{1,i}) - F(a_{2,i-1}) = c_i - c_{i-1}^{1/(1+\theta)} & \text{for } j = i;\ i = 1, \dots, m \\ F(a_{2,i-1}) - F(a_{1,i-1}) = c_{i-1}^{1/(1+\theta)} - c_{i-1} & \text{for } j = i - 1;\ i = 2, \dots, m \\ 0 & \text{otherwise} \end{cases}$$
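The overlap probabilities above can be tabulated numerically. The sketch below assumes cumulative levels $c_j = j(j+1)/(m(m+1))$ with $c_0 = 0$ (the reading used for the GSMMP quantiles); the function names are hypothetical. The bound returned by `theta_limit` is derived here directly from the requirement that each shifted breakpoint stay to the left of the next unshifted one, and is not a transcription of the report's closed-form expression.

```python
import math


def cumulative_levels(m):
    """c_j = j(j+1) / (m(m+1)), j = 0..m (assumed GSMMP levels, c_0 = 0)."""
    return [j * (j + 1) / (m * (m + 1)) for j in range(m + 1)]


def intersection_probs(m, theta):
    """Return (same, prev): overlap probabilities for j = i and for j = i - 1."""
    c = cumulative_levels(m)
    # F(a_{2,j}) = c_j ** (1 / (1 + theta)) under the Lehmann alternative
    shifted = [x ** (1.0 / (1.0 + theta)) for x in c]
    same = [c[i] - shifted[i - 1] for i in range(1, m + 1)]   # j = i
    prev = [shifted[i - 1] - c[i - 1] for i in range(2, m + 1)]  # j = i - 1
    return same, prev


def theta_limit(m):
    """Largest theta for which shifted[i] <= c[i+1] holds for every i."""
    c = cumulative_levels(m)
    bounds = [math.log(c[i]) / math.log(c[i + 1]) - 1.0 for i in range(1, m - 1)]
    return min(bounds) if bounds else math.inf
```

Within the valid range the two lists are nonnegative and together sum to one, which is a quick consistency check on the case analysis.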

Thus, all components of the efficacy $\mathcal{E}_T$ of Eq. (13) for Lehmann alternatives are calculated. It should be noted that there exists a range $0 < \theta < \theta_{\max}$ for which these results are valid; namely, $\theta_{\max}$ corresponds to the point where $a_{2,i}$ crosses over to the right of $a_{1,i+1}$. It is easy to show that

1,1+1  ' 


in 


r m(m+l)  1 

L ~ 


max 


= max 


EHEA., 


m - 1 


2<i<m  . rm(m+l)"| 
SIGNAL SELECTION AND OPTIMUM SIGNAL DESIGN USING ROBUST PARTITION STATISTICS

P.  Kersten  and  L.  Kurz 

The statistical properties of the m-Interval Partition Detectors and their variants are well developed. A concise exposition of the present state of the art may be found in Reference 2. In particular, the linear m-Interval Partition Test is nonparametric, robust, simple to implement, and in its most powerful form operates with little loss of efficiency in comparison to the locally most powerful detector. Moreover, in severe impulsive noise environments, the m-Interval Partition Detector retains its robustness. Bushnell (Ref. 3) has considered the performance of the m-Interval Partition Detector (MIPD) in Gaussian and impulsive noise and compared it to the performance of the likelihood ratio detector designed for Gaussian noise. The results of this study support the claim that the MIPD retains its robustness even if the noise distribution is replaced by the mixture model (Gaussian and impulsive noise).

In this report, the theory of the m-Interval Detector is extended to handle side constraints on non-constant signals and optimization of the associated detector. Using the derived locally most powerful scores, the performance of this detector is investigated. A discrete formulation is used to select the signal and correlation function of the detector to optimize a performance index which reflects the system constraints. Though the detector appears to be similar to a conventional correlation detector, it retains the robustness properties of the m-Interval Detector.

A.  Problem  Statement 

Consider the system block diagrammed in Figure 1. Unlike the usual detector for non-constant signals, where one specifies the form of the signal and obtains the locally most powerful scores, both the signal and the correlation function (which corresponds to $d_k$ of Ref. 2) remain unspecified. The m-Interval Partition Test is expressed in the form of Eq. (1). (Although the form of T in Eq. (1) appears to be that of a linear matched filter, it possesses the robustness of the m-Interval Partition Statistic.)

$$T = \frac{1}{n} \sum_{k=1}^{n} g(k)\, h(X(k)) \qquad (1)$$

where $X(k) = n(k) + s(k)$. Its relation to the form of the statistic used in Ref. 2 will be established later. The function $h(\cdot)$ is assumed to be specified, and the i.i.d. noise samples are denoted by $n(k)$ with index $k$. The question is posed: Given a performance measure, what choice of signal $s(k)$ and correlation function $g(k)$ optimizes the performance measure of the system? This problem statement is one of the many forms of the Discrete Signal Design Problem, and is not complete until both the performance measure and the constraints imposed by the system are specified. In practice, the designer has to ensure that the problem statement represents a good approximation to the actual physical system and to the constraints that motivate the problem; on the other hand, he must also relax and/or tighten various conditions to either obtain non-trivial solutions or limit the number of solutions. The discrete signal approach as explained by Silver and Kurz (Ref. 4) has several advantages. In many systems, the detection scheme is actually applied to sampled data, and so this approach better models many actual physical systems. The discrete version of the optimization problem is often more amenable to solution. In addition, the increased usage of digital signal processing techniques in information processing systems further motivates the discrete formulation of this problem.

Fig. 1. Plot of r(x) and v(x) vs. x for the LMP scores and standard normal distribution using quantiles 0.125, 0.25, 0.375, 0.5, 0.625, 0.750, 0.875.
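The statistic of Eq. (1) can be sketched in a few lines. The example assumes that $h$ maps each sample to the score of the partition cell containing it; the breakpoints and scores are placeholders chosen for illustration, not the LMP scores of the report, and the function names are hypothetical.

```python
from bisect import bisect_right


def make_score_function(breakpoints, scores):
    """Return h(x): the score of the partition cell that contains x.

    breakpoints must be sorted; cells are [a_{j-1}, a_j), so a sample equal
    to a breakpoint is assigned to the cell on its right.
    """
    if len(scores) != len(breakpoints) + 1:
        raise ValueError("need one score per cell (len(breakpoints) + 1)")

    def h(x):
        return scores[bisect_right(breakpoints, x)]

    return h


def partition_statistic(samples, g, h):
    """T = (1/n) * sum_k g(k) * h(X(k)), the form of Eq. (1)."""
    if len(samples) != len(g):
        raise ValueError("samples and g must have the same length")
    n = len(samples)
    return sum(gk * h(xk) for gk, xk in zip(g, samples)) / n
```

Because each sample enters only through the score of its cell, a single large impulse changes one term by at most the largest score, which is the mechanism behind the robustness claimed in the text.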

To relate the form of the m-Interval Partition Test given by Eq. (1) to the usual form, one equates the expected value of both of these forms to obtain

$$E[h(s(k) + n(k))] = \sum_{j=1}^{m} c_j \big[ F(a_j - s(k)) - F(a_{j-1} - s(k)) \big] = E\,h(s(k))$$

and thus $d_k = g(k)$, where the notation of Ref. 2 is used. The expected value of T is then

$$E[T] = \frac{1}{n} \sum_{k=1}^{n} g(k)\, E[h(s(k))].$$

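The variance of T as given by Eq. (1) is

$$\mathrm{Var}(T) = E(T - ET)^2 = \frac{1}{n^2} \sum_{j=1}^{n} \sum_{k=1}^{n} g(j)\, g(k)\, R(j,k) \qquad (2)$$

where $R(j,k) = E\{[h(X(j)) - E\,h(X(j))]\,[h(X(k)) - E\,h(X(k))]\}$. The noise samples $n(j)$ and $n(k)$ are zero-mean i.i.d. random variables, so that $E[n(j)n(k)] = \sigma^2 \delta_{jk}$, where $\delta_{jk}$ is the Kronecker delta; consequently $R(j,k)$ vanishes for $j \neq k$. Define $r(s(k)) = E[h(X(k))]$, $w(s(k)) = E\,h^2(X(k))$, and $v(s(k)) = w(s(k)) - r^2(s(k))$. Using these definitions in Eq. (2), it follows that

$$n\,\mathrm{Var}(T) = \frac{1}{n} \sum_{j=1}^{n} \sum_{k=1}^{n} g(j)\, g(k)\, R(j,k) = \frac{1}{n} \sum_{j=1}^{n} g^2(j)\, v(s(j))$$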
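The functions $r(s)$ and $v(s)$ can be evaluated directly from the cell probabilities. The sketch below assumes standard normal noise and the cell-probability expression $F(a_j - s) - F(a_{j-1} - s)$ used above, with the outer cells extending to infinity; the breakpoints and scores passed in are the caller's choice, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def cell_probs(breakpoints, s):
    """Probability of each cell for X = s + N(0, 1) noise."""
    edges = [-math.inf] + list(breakpoints) + [math.inf]
    cdf = []
    for edge in edges:
        if edge == -math.inf:
            cdf.append(0.0)
        elif edge == math.inf:
            cdf.append(1.0)
        else:
            cdf.append(_PHI(edge - s))
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


def r_and_v(breakpoints, scores, s):
    """r(s) = E[h(X)] and v(s) = E[h(X)^2] - r(s)^2 for signal level s."""
    probs = cell_probs(breakpoints, s)
    r = sum(c * p for c, p in zip(scores, probs))
    w = sum(c * c * p for c, p in zip(scores, probs))
    return r, w - r * r
```

With antisymmetric scores and symmetric breakpoints, $r(0) = 0$ and $r(s)$ grows with $s$, consistent with the remark later in the text that $r(s(k))$ is near zero for zero signal.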

Nawrocky (Ref. 5) has considered the design of optimal signals and detectors to minimize the effects of noise and intersymbol interference in the transmission of digital data using the discrete-time approach. This formulation allowed him to directly implement the results in simple digital structures. This particular formulation of the problem optimizes a cost functional which incorporates the signal-to-noise ratio and the Gabor bandwidth criterion (Ref. 6). Following Nawrocky, one seeks to maximize the functional

$$J' = \underbrace{hA \sum_{k=0}^{K-1} g(k)\, r(s(k))}_{S_1} - \underbrace{hB \sum_{k=0}^{K-1} g^2(k)\, v(s(k))}_{S_2} - \underbrace{hC \sum_{k=0}^{K-1} (\Delta g(k))^2}_{S_3} - \underbrace{hD \sum_{k=0}^{K-1} (\Delta s(k))^2}_{S_4} \qquad (3)$$

where $h = 1/K$ is the stepping increment and $\Delta$ is the forward difference operator. The first term is the expected value of the statistic T and the second term is $K\,\mathrm{Var}(T)$. The third term $S_3$ is a measure of the bandwidth of the correlation function and the fourth term is a measure of the bandwidth of the signal.
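A direct evaluation of the functional of Eq. (3) may help fix the roles of the four terms. The sketch assumes the forward difference is scaled by the step, $\Delta x(k) = (x(k+1) - x(k))/h$, as implied by the state equations introduced later; `r` and `v` are passed in as callables, and all names are hypothetical.

```python
def forward_difference(x, h):
    """Forward difference (x(k+1) - x(k)) / h for k = 0..len(x)-2."""
    return [(x[k + 1] - x[k]) / h for k in range(len(x) - 1)]


def cost_functional(s, g, r, v, A, B, C, D):
    """J' of Eq. (3) for sequences s(0..K) and g(0..K) of equal length K + 1."""
    if len(s) != len(g) or len(s) < 2:
        raise ValueError("s and g must have the same length, at least 2")
    K = len(s) - 1
    h = 1.0 / K
    ds = forward_difference(s, h)
    dg = forward_difference(g, h)
    s1 = h * A * sum(g[k] * r(s[k]) for k in range(K))       # mean term
    s2 = h * B * sum(g[k] ** 2 * v(s[k]) for k in range(K))  # variance term
    s3 = h * C * sum(d * d for d in dg)                      # bandwidth of g
    s4 = h * D * sum(d * d for d in ds)                      # bandwidth of s
    return s1 - s2 - s3 - s4
```

For example, with the toy choices `r = lambda x: x` and `v = lambda x: 1.0`, the sequences `s = g = [0, 1, 0]` and unit weights give $J' = 0.5 - 0.5 - 4 - 4 = -8$; the bandwidth penalties dominate for such an abrupt pulse.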

The functional $J'$ is maximum when the last three terms of Eq. (3) are minimized and the first term is maximized subject to the constraints that will be specified later. Physically, this means the bandwidths of both the signal and the correlation function are to be minimized along with $K\,\mathrm{Var}(T)$ of the statistic. In contrast, the expected value of the statistic T is to be maximized. The $s(k)$ and $g(k)$ which optimize $J'$ maximize the signal-to-noise ratio of the system while reducing the bandwidth of both $s(k)$ and $g(k)$. The definitions of both $r(s(k))$ and $v(s(k))$ involve the CDF of the underlying noise distribution, and therefore the signal design problem is dependent upon the underlying noise distribution as well as the scores of the statistic T. Accordingly, assume $\Phi(x)$ is the standard normal CDF and $c_j$ are the locally most powerful scores. Bushnell (Ref. 3) has shown that with Gaussian scores the linear m-Interval Partition Test performs well even in Gaussian and impulsive noise environments. Thus Gaussian scores are a good initial approximation for studying this optimization problem since the performance of the statistic is relatively insensitive to perturbations in the Gaussian assumption. Figure 1 contains plots of both $r(x)$ and $v(x)$ for this case. Note that the first two derivatives of both r and v are continuous and bounded, i.e., $r, v \in C^2$. Since h is common to all the terms in $J'$, $J = J'/h$ is maximized instead of $J'$. To obtain a physically meaningful solution, energy constraints must be placed upon both $g(k)$ and $s(k)$. That is, both

$$\sum_{k=0}^{K-1} s^2(k)\, h \quad \text{and} \quad \sum_{k=0}^{K-1} g^2(k)\, h$$

are required to be equal to given constants. Couching the above in the terminology of optimal control allows application of all the optimization techniques developed for control theory to the signal design problem. Define $x_1(k) = s(k)$ and let

$$u_1(k) = \Delta x_1(k), \quad \text{i.e.,} \quad x_1(k+1) = x_1(k) + h\,u_1(k) \qquad (4)$$

Thus $u_1^2(k)$ is a measure of the Gabor bandwidth of $s(k)$. To express the energy constraint upon $s(k)$, let

$$\Delta x_2(k) = x_1^2(k) \quad \text{or} \quad x_2(k+1) = x_2(k) + h\,x_1^2(k) \qquad (5)$$

so that

$$\sum_{k=1}^{K-1} \Delta x_2(k) = \sum_{k=1}^{K-1} x_1^2(k) \sim \text{power of the signal } s(k).$$

In like manner define $x_3(k) = g(k)$ and let

$$u_2(k) = \Delta x_3(k) \quad \text{or} \quad x_3(k+1) = x_3(k) + h\,u_2(k) \qquad (6)$$

so that $u_2^2(k)$ is a measure of the Gabor bandwidth of $g(k)$. In addition,

$$\Delta x_4(k) = x_3^2(k) \quad \text{or} \quad x_4(k+1) = x_4(k) + h\,x_3^2(k) \qquad (7)$$

implements the energy constraint upon $x_3(k)$.


Passing to vector notation, define $x(k)^T = (x_1(k), x_2(k), x_3(k), x_4(k))$ as the state vector and $u(k)^T = (u_1(k), u_2(k))$ as the control vector of the system defined by Equations (4) to (7). The initial and final values of the state vector are then given by $x(0) = 0$ and $x(K)^T = (0, E_s, 0, E_g)$, respectively. J is then expressed by

$$J = \sum_{k=0}^{K-1} L(x(k), u(k)) \qquad (8)$$

where $L(x(k), u(k)) = A\,x_3(k)\,r(x_1(k)) - B\,x_3^2(k)\,v(x_1(k)) - C\,u_2^2(k) - D\,u_1^2(k)$, and the system equations can be expressed in a generic form as

$$x(k+1) = \mathcal{F}(x(k), u(k)) \qquad (9)$$

The above formulation permits the direct application of the Gradient Projection Algorithm. In fact, the algorithm described by Dyer and McReynolds for implementing this method is directly applicable provided the vector of parameters $\alpha$ is set equal to zero (Ref. 7, pp. 57-61). This algorithm implements the gradient projection method by iteratively calculating the proper gradients with respect to the control variables, thus producing an efficient computer program. This approach is considered in Section B.
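The state recursion of Equations (4) to (7) can be simulated directly. In the sketch below the state is ordered as (signal, accumulated signal energy, correlation function, accumulated correlation energy), which is the ordering inferred from the terminal condition $x(K)^T = (0, E_s, 0, E_g)$; the function name is hypothetical.

```python
def simulate_states(u1, u2, h):
    """Propagate x(k+1) = F(x(k), u(k)) from x(0) = 0.

    State ordering (an assumption based on the terminal condition):
      x1 = s(k), x2 = accumulated energy of s,
      x3 = g(k), x4 = accumulated energy of g.
    Returns the list of states x(0), ..., x(K) as tuples.
    """
    if len(u1) != len(u2):
        raise ValueError("control sequences must have the same length")
    x1 = x2 = x3 = x4 = 0.0
    trajectory = [(x1, x2, x3, x4)]
    for a, b in zip(u1, u2):
        # Right-hand sides use the state at step k, as in Eqs. (4) to (7).
        x1, x2, x3, x4 = (x1 + h * a, x2 + h * x1 * x1,
                          x3 + h * b, x4 + h * x3 * x3)
        trajectory.append((x1, x2, x3, x4))
    return trajectory
```

The final state then reports both whether the pulses return to zero and how much energy they carry, which is exactly what the terminal constraint tests.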

B. The Gradient Projection Method for Solution of the Signal Design Problem Using the m-Interval Partition Statistic

The Gradient Projection Algorithm (GPA) is simple in concept although complex in analysis and application. This method optimizes a functional subject to constraints by an iterative procedure which is a variation of the steepest ascent (descent) method. This algorithm replaces the gradient of the functional J with a projected gradient, where the projection is upon the feasible region (Ref. 8).

The implementation of this method by Dyer and McReynolds (Ref. 7) employs a format which is amenable to digital computation. Their approach is to treat the gradient projection method via the method of Lagrange multipliers, using the constraints to form an augmented functional. To be specific, let $L(\alpha)$ represent the function which one wishes to maximize and let $M(\alpha) = 0$ be a vector constraint to be satisfied by any feasible point. One constructs the augmented functional

$$J^* = L + (M - Q)^T \nu^k \qquad (10)$$

where Q is a vector representing the constraint level (normally taken as zero in the standard Lagrange formulation) and $\nu^k$ represents a vector of multipliers at the k-th iteration. One then proceeds in obtaining the gradient correction to a feasible solution $\alpha^k$ at the k-th step in precisely the same manner as with the steepest ascent method.
This yields

$$\delta\alpha = \alpha^{k+1} - \alpha^k = \epsilon\,(L_\alpha^T + M_\alpha^T \nu^k) \qquad (11)$$

where $\epsilon$ is the stepping parameter and $L_\alpha^T$ is the partial derivative of L with respect to the vector of parameters $\alpha$, which is an $n \times 1$ vector since Dyer and McReynolds define $L_\alpha = (\partial L/\partial\alpha_1, \dots, \partial L/\partial\alpha_n)$. Following their notation, $M_\alpha$ is a matrix $[\partial M_i/\partial\alpha_j]$, $j = 1, \dots, n$; $i = 1, \dots, q$. The vector $\nu^k$ is then calculated to ensure a predetermined small change in the constraint level Q. That is, $\delta Q$ is proportional to the error in the constraint $M(\alpha) = 0$. Neglecting higher than linear terms in the approximation of the functional, $\delta Q = M_\alpha\,\delta\alpha$, and substituting $\delta\alpha$ from Eq. (11) into this equation yields

$$\delta Q = \epsilon\,(M_\alpha L_\alpha^T + M_\alpha M_\alpha^T \nu^k)\Big|_{\alpha = \alpha^k},$$

which when solved for $\nu^k$ gives

$$\nu^k = (M_\alpha M_\alpha^T)^{-1}\,(\delta Q/\epsilon - M_\alpha L_\alpha^T). \qquad (12)$$

Thus to implement this method, one uses the recursion

$$\alpha^{k+1} = \alpha^k + \epsilon\,\big(L_\alpha + (\nu^k)^T M_\alpha\big)^T \qquad (13)$$

where $\nu^k$ is given by Equation (12).
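The recursion of Eqs. (12) and (13) can be tried on a toy problem. The sketch below is restricted to a single scalar constraint so that $M_\alpha M_\alpha^T$ is a scalar, and it assumes $\delta Q$ is set to the negative of the current constraint error (a full correction in one step); neither the toy objective nor the function names come from the report or from Dyer and McReynolds.

```python
def gradient_projection(grad_l, constraint, grad_m, alpha, eps=0.1, iterations=200):
    """Maximize L(alpha) subject to one scalar constraint M(alpha) = 0.

    grad_l(alpha)     -> list, gradient of L (the row vector L_alpha)
    constraint(alpha) -> float, value of M(alpha)
    grad_m(alpha)     -> list, gradient of M (the row vector M_alpha)
    """
    alpha = list(alpha)
    for _ in range(iterations):
        l_a = grad_l(alpha)
        m_a = grad_m(alpha)
        delta_q = -constraint(alpha)  # assumption: remove the whole error
        mm = sum(m * m for m in m_a)                 # M_alpha M_alpha^T
        ml = sum(m * g for m, g in zip(m_a, l_a))    # M_alpha L_alpha^T
        nu = (delta_q / eps - ml) / mm               # Eq. (12), scalar case
        # Eq. (13): step along the gradient corrected by the multiplier term
        alpha = [a + eps * (g + nu * m) for a, g, m in zip(alpha, l_a, m_a)]
    return alpha


def toy_solution():
    """Maximize -(a1 - 2)^2 - (a2 - 1)^2 subject to a1 + a2 - 1 = 0."""
    return gradient_projection(
        grad_l=lambda a: [-2.0 * (a[0] - 2.0), -2.0 * (a[1] - 1.0)],
        constraint=lambda a: a[0] + a[1] - 1.0,
        grad_m=lambda a: [1.0, 1.0],
        alpha=[0.0, 0.0],
    )
```

For this toy problem the constrained maximum is the projection of the point (2, 1) onto the line $a_1 + a_2 = 1$, namely (1, 0). The first step restores feasibility and the remaining steps move along the constraint.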


Application of this procedure to the signal design problem under consideration is straightforward in principle although the details are intricate. The exact algorithm is contained in Reference 7. A computer program was implemented using this algorithm and run with A = B = C = D = 1. The resultant shapes of both $s(k)$ and $g(k)$ are that of a sinusoidal pulse, as plotted in Figure 2. However, when A = 100, B = C = D = 1, the curves for $g(k)$ and $s(k)$ are different from each other and different from the sine pulse, although the perturbation from the sinusoid is small. These curves are plotted in Figure 4. Note that $s(k)$ has steeper sides at the beginning and end of the pulse. This is to be expected since $r(s(k))$ is near zero for zero signal, and one expects that in order to increase

$$\sum_{k=0}^{K-1} g(k)\, r(s(k))$$

$s(k)$ will have to increase faster at the beginning and decrease faster at the end, but remain flatter in the center. This allows more contributions to

$$\sum_{k=0}^{K-1} g(k)\, r(s(k))$$

from the initial and final values of the summation while still satisfying the energy constraints. In essence, these results imply that the signal design is not sensitive to the relative importance of the constraints and the signal-to-noise ratio, i.e., the types of constraints are the important factors in the functional for determining the shape of $s(k)$ and $g(k)$. This algorithm converges quickly (10-15 iterations) since $u_1(k)$ and $u_2(k)$ are known to be essentially cosines and are so chosen to initiate the program. The initial starting values were obtained from an application of the discrete maximum principle to a simplified version of the same problem. Though this problem was analyzed for Gaussian additive noise, the method of solution is applicable to other noise distributions possessing a continuous p.d.f. The procedure and the associated software are sufficiently general and flexible to include other constraints without any difficulty.

Fig. 2. Plot of g(t) and s(t) vs. t for A=B=C=D=1, h=0.1 and LMP scores. Plot is 2 sin 2 t to several decimal places, N=25 iterations of the gradient projection algorithm.

Fig. 3. Plot of s(t) and 2 sin πt vs. t for A=100, B=C=D=1, h=0.1 and standard normal and LMP scores. Gradient projection method.

Joint  Services  Technical  Advisory  Committee 

F44620-74-C-0056  P.  Kersten  and  L.  Kurz 

REFERENCES 

1. Y.C. Ching and L. Kurz, "Nonparametric Detectors Based on m-Interval Partitioning," IEEE Trans. on Information Theory, IT-18, No. 2, 251-257 (March 1972).

2. L. Kurz, "Nonparametric Detectors Based on Partition Tests," Supplementary Notes, EE704, PINY (September 1975).

3. W.J. Bushnell, "Optimization and Performance of Detectors Based on Partition Tests," Ph.D. Dissertation, Polytech. Inst. of New York (1975).

4. H.I. Silver and L. Kurz, "A Class of Discrete Signal Design Problems in Burst Noise," IEEE Trans. on Information Theory, IT-18, No. 2 (March 1972).

5. R.J. Nawrocky, "A Discrete Time Approach to Signal and Detector Design in Additive Gaussian and Burst Noise and Intersymbol Interferences," Ph.D. Dissertation, Polytech. Inst. of New York (1975) and R.J. Nawrocky and L. Kurz, "Design of Digital Filters for Extraction of Signals for Inter-Symbol Interference and Noise," 8th Annual Conference on Information Sciences and Systems, Princeton, N.J. (March 28-29, 1974).

6. D. Gabor, "Theory of Communications," J. Inst. Electrical Engineers (London), 93, Part 3, 427-457 (November 1946).

7. P. Dyer and S.R. McReynolds, "The Computation and Theory of Optimal Control," Academic Press, N.Y. (1970).

8. J.B. Rosen, "The Gradient Projection Method for Non-Linear Programming. Part I. Linear Constraints," J. of Soc. for Industrial and Applied Math., No. 15 (March 1960).

9. P. Kersten, "Robustized Recursive Estimation and Adaptive Partition Detectors," Ph.D. Dissertation, Polytech. Inst. of New York (1976).


300  COMMUNICATIONS 

A NOVEL FM DETECTOR FOR SUPPRESSION OF INTERCHANNEL INTERFERENCE
F.A. Cassara and H. Schachter

Numerous investigators have focused their attention on the problem of interchannel interference in FM receivers. Some studies specialize to analog FM; others to digital frequency or phase shift keyed signals. All studies conclude that the presence of interfering signals degrades the quality of reception severely.

Interfering signals can arise in the following general areas:

(1) Co-channel interferer, e.g., (a) neighboring transmitters sharing the same frequency band; (b) spurious signals such as an image channel in a superheterodyne receiver; (c) multipath echoes.

(2) Adjacent channel interferer, e.g., (a) inadequate selectivity in the receiver's IF filter; (b) "crowding" in the radio frequency spectrum; (c) spurious signals.

In this report a novel FM detector which has demonstrated capability in suppressing the degradation in receiver performance due to the presence of an interfering signal is presented.

A.  The  Novel  FM  Detector 

The block diagram of the novel FM detector is shown in Figure 1. The principle of operation can be described as follows: assume the input signal consists of a frequency modulated carrier plus a frequency modulated interferer. The interferer to desired carrier amplitude ratio is denoted as q. Phase-locked loop (PLL) #1 locks on to and tracks (by the capture effect) the stronger of the two received FM signals, but its voltage controlled oscillator output signal (VCO #1) lags by approximately 90°. An additional 90° phase shift is introduced by phase shifter #2 so that the signal appearing at phase shifter #2's output is 180° out of phase with respect to the stronger received signal. By proper adjustment of the gain constants of summer #2 the stronger received signal is cancelled, leaving only the weaker of the two received FM signals at the input to PLL #2. The instantaneous phase of the VCO #2 output signal tracks the instantaneous phase of the weaker signal but lags by 90°. An additional 90° phase shift is introduced by phase shifter #1, producing a signal at the output of phase shifter #1 which is 180° out of phase with respect to the weaker of the two received FM signals. The weaker signal can thus be cancelled at summer #1, leaving only the stronger signal appearing at the input to PLL #1. Since this novel detector has two separate outputs, namely the outputs of the individual phase-locked loops, it possesses the capability of demodulating both the stronger and the weaker received signals even though they may be co-channel and share the same frequency band. This is a task impossible with any of the other existing FM detectors since they all obey the well known capture effect.

Fig. 1. Novel FM detector for suppression of interchannel interference.
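The cancellation step at a summer can be illustrated with an idealized calculation. The sketch below assumes a perfect replica of the stronger signal's phase (no PLL dynamics, noise, or gain error) and sinusoidal rather than square wave carriers; the modulation parameters and function names are placeholders for illustration only.

```python
import math


def composite(t, q, phase_strong, phase_weak):
    """Unit-amplitude strong FM signal plus a weaker one of relative amplitude q."""
    return math.cos(phase_strong(t)) + q * math.cos(phase_weak(t))


def cancel_stronger(t, q, phase_strong, phase_weak, gain=1.0):
    """Summer output: input plus a 180-degree shifted replica of the strong signal."""
    replica = gain * math.cos(phase_strong(t) + math.pi)
    return composite(t, q, phase_strong, phase_weak) + replica


def fm_phase(carrier_hz, mod_hz, index):
    """Phase of a sinusoidally modulated FM carrier (illustrative helper)."""
    return lambda t: (2.0 * math.pi * carrier_hz * t
                      + index * math.sin(2.0 * math.pi * mod_hz * t))
```

With the summer gain matched to the strong signal's amplitude, the residual equals the weaker signal alone, which is what PLL #2 is then able to lock onto. A gain mismatch leaves a scaled copy of the stronger signal in the residual.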

B.  Experimental  Results 

Hardware for this new detector was constructed using monolithic PLL integrated circuits (Signetics 562B) to detect square wave FM carriers. The novel device successfully demodulated both the stronger and weaker sinusoidally modulated FM carriers for the case where the stronger signal was as much as 20 dB greater than the weaker carrier. Figure 2 describes the response of the novel demodulator and limiter-discriminator (General Radio 1142-A) when simultaneously driven by a 455 kHz square wave carrier plus a co-channel 455 kHz FM square wave interferer. The desired carrier was frequency modulated by a 100 Hz sinusoid while the interferer was frequency modulated by a 200 Hz triangle wave. The peak frequency deviation of both carriers was 7 kHz. The interferer to desired carrier amplitude ratio was set at q = 1/2. Each output was filtered by identical 6 kHz bandwidth post detection low pass filters (Krohn-Hite Model 3202). Each PLL loop bandwidth was designed to be 100 kHz (much wider than the 7 kHz peak frequency deviation of the input carriers). In addition, each PLL was essentially a first order loop with a wide (20 kHz) bandwidth single pole RC low pass loop filter. The results of Fig. 2 reveal that the novel detector can demodulate both the stronger and weaker FM carriers with considerable improvement over the limiter-discriminator which, by the capture effect, demodulates only the stronger received FM signal.

Fig. 2. Response of Lim-Disc. and novel FM detector with q = 0.5.

More quantitative results related to this oscillogram are shown in Fig. 3, where the detected output SNR vs. q is plotted. Noise here is interpreted as distortion appearing in the detected outputs due to the presence of an interferer. No random noise was introduced in these tests. Here we see that the novel detector offers 21 dB improvement compared to the limiter-discriminator over a considerable range of q.

Fig. 3. Output SNR vs. q for Lim-Disc. and novel FM detector.

The sensitivity of the novel detector of Fig. 1 to amplitude fluctuations in the received signals was experimentally evaluated by driving the demodulator with a 10 volt peak-to-peak 457 kHz square wave carrier frequency modulated by a 100 Hz triangle wave (see upper trace of oscillogram in Fig. 4) plus a co-channel 457 kHz FM square wave interferer amplitude modulated by a 50 Hz sine wave and simultaneously frequency modulated by a 200 Hz sinusoidal signal (second trace in Figure 4). Both FM carriers had a 10 kHz peak-to-peak frequency deviation. The third and fourth traces illustrate the demodulated outputs of PLL #1 and PLL #2, respectively. Both PLL outputs were filtered by identical 10 kHz bandwidth low pass filters. As can be seen from the oscillogram, the weaker FM carrier (second trace) varies in amplitude from a minimum of 2.5 volts peak-to-peak to a maximum of 7.5 volts peak-to-peak due to the 50 Hz sinusoidal amplitude modulation. This results in a variation in the FM carrier amplitude ratio, q, of 0.25 < q < 0.75. Throughout this range the novel detector demodulated both co-channel FM signals quite well.

Fig. 4. Response of novel FM detector of Fig. 1 to AM/FM interferer.
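The quoted ratio range follows from the peak-to-peak voltages by simple division, and the same ratio can be expressed in decibels for comparison with the 20 dB figure given earlier. The helper names below are hypothetical; the arithmetic is the only content.

```python
import math


def amplitude_ratio(interferer_vpp, carrier_vpp):
    """Interferer to desired carrier amplitude ratio from peak-to-peak voltages."""
    return interferer_vpp / carrier_vpp


def ratio_db(ratio):
    """Amplitude ratio expressed in decibels (20 log10)."""
    return 20.0 * math.log10(ratio)
```

A 2.5 V to 7.5 V interferer against a 10 V carrier gives ratios of 0.25 and 0.75, roughly -12 dB and -2.5 dB; a 20 dB amplitude difference corresponds to a ratio of 0.1.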

Figure 5 illustrates the block diagram of a modified receiver structure specifically designed to accommodate the case when the applied FM signals have amplitudes which vary appreciably, such as occur in a "fading" interferer or a multiple interferer signal environment. In the latter case all the interferers may be "lumped" together analytically and represented as one AM/FM interferer. Modifications to the novel detector of Fig. 1 are shown in dashed lines. The input signals are amplitude modulated by $r_1(t)$ and $r_2(t)$ and simultaneously phase modulated by $\psi_1(t)$ and $\psi_2(t)$, as shown in Figure 5. In this modified receiver structure the cancellation signals fed back are no longer constant amplitude signals but rather AM/FM signals and thus are capable of cancelling their respective input signals. The basic operating principle for this modified structure is the same as that presented earlier in this report. No experimental results have thus far been obtained on this modified receiver structure; this is left as a study for future research.

Fig. 5. Novel FM detector modified to demodulate AM/FM input signals.

C.  Conclusions 

A novel FM detector capable of demodulating an FM signal corrupted by a co-channel interferer for the case when the interferer is as much as 20 dB stronger than the desired FM signal has been presented. Experimental results demonstrating such capability have been included. Such results revealed, among other things, that the demodulator was relatively insensitive to variations in the interferer to desired carrier amplitude ratio. Future studies on the novel detector will include the effects of the simultaneous action of random noise and an interferer, multiple and fading interferers, and digital frequency shift keyed signals.
OPTIMIZATION OF M-ARY DIGITAL TRANSMISSION SYSTEMS IN THE PRESENCE OF INTERSYMBOL INTERFERENCE AND NOISE

L. Kurz and R.J. Nawrocky


In this report, the problem of optimum signal and detector selection in the presence of intersymbol interference and noise, as formulated in Refs. 1 and 2, is extended to M-ary transmission systems, where the signal set consists of M distinct waveforms. Tutorial treatments of M-ary communication problems may be found in Helstrom (Ref. 3) and Weber (Ref. 4). The problem of generating M-ary alphabets with useful properties in burst noise has been considered by Silver and Kurz (Ref. 5) using amplitude averaging techniques, while other aspects of the M-ary problem have been studied by Nuttall (Ref. 6), Nuttall and Amoroso (Ref. 7), and Nuttall and Floyd (Ref. 8).

In this report, the posed optimization problem is approached through the minimization of a general system cost functional based on the noise-to-signal ratio criterion and a cost matrix relating the properties of the individual signals and detectors. In particular, two forms of the cost matrix are considered, leading to either the minimum probability of error or the maximum probability of detection solutions.

For simplicity of the presentation, the problem is initially developed for the all-pass channel and white Gaussian noise. Subsequently, the formulation is extended to include a bandlimited channel. The performance of M-ary systems in additive Gaussian noise and intersymbol interference is examined and compared on the basis of a constant data transmission rate.


A.  Development  and  Minimization  of  the  Cost  Functional:  All-pass  Channel 

Consider the general M-ary transmission system of Fig. 1 for communication over a single channel in an additive Gaussian noise environment. For simplicity of the presentation, the channel is assumed to be all-pass. An extension to bandlimited channels is treated in the following section.


Fig. 1. A model of an M-ary transmission system.

In the all-pass case, the generalized cost functionals for a signal $s_m(k)$ and a detector $g_m(k)$, each subject to the Gabor bandwidth constraints, with $m = 1, 2, \ldots, M$, may be written using discrete formulation as

$$J_{sm} = \Delta \sum_{k=0}^{K-1} G_{sm}(k) \tag{1}$$

and

$$J_{gm} = \Delta \sum_{k=0}^{K-1} \Big\{ \sum_{n=1}^{M} s_n(k)\, a_{mn}\, g_m(k) + B\, g_m(k) \sum_{j=1}^{K-1} R(k,j)\, g_m(j) + \sum_{\ell=1}^{L_g} d_{\ell m} \big[\nabla^{\ell} g_m(k)\big]^2 \Big\} = \Delta \sum_{k=0}^{K-1} G_{gm}(k) \tag{2}$$


respectively. In the above expressions, the coefficients $a_{mn}$ represent the cost associated with the correlation between the m-th signal and the n-th detector. The remaining notation is similar to the notation of Reference 1. The overall system cost functional can be expressed in terms of Eqs. (1) and (2) as

$$J = \sum_{m=1}^{M} \big[ J_{sm} + J_{gm} \big] \tag{3}$$


The correlations between pairs of signals and pairs of detectors may be included as side constraints

$$\nabla x_{smn}(k) = s_m(k)\, s_n(k) \tag{4}$$

and

$$\nabla x_{gmn}(k) = g_m(k)\, g_n(k) \tag{5}$$

for a total of $2M^2$ constraints. Additional $(L_s + L_g)M$ constraints arise from the bandwidth terms in Equations (1) and (2). For $L_s = L_g = 1$, these constraints become

$$\nabla s_m(k) = u_m(k) \tag{6}$$

and

$$\nabla g_m(k) = v_m(k) \tag{7}$$


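The Hamiltonian corresponding to a signal-detector pair can be expressed in terms of Eqs. (1) through (7) and adjoint variables $p_m$, $q_m$, $\psi_{mn}$, and $\phi_{mn}$ as

$$H_m(k) = -\Delta\big[G_{sm}(k) + G_{gm}(k)\big] + \sum_{n=1}^{M}\Big\{\psi_{mn}\big[x_{smn}(k) + \Delta\, s_m(k)\, s_n(k)\big] + \phi_{mn}\big[x_{gmn}(k) + \Delta\, g_m(k)\, g_n(k)\big]\Big\} + p_m(k+1)\big[s_m(k) + \Delta u_m(k)\big] + q_m(k+1)\big[g_m(k) + \Delta v_m(k)\big] \tag{8}$$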


The variables $\psi_{mn}$ and $\phi_{mn}$, which can be shown to be time-independent, take on a special significance in this formulation because they represent the signal and the detector correlation coefficients, respectively. The overall Hamiltonian expressed in terms of Eq. (8) becomes

$$H(k) = \sum_{m=1}^{M} H_m(k) \tag{9}$$


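Maximizing Eq. (9) with respect to the control variables yields

$$u_m(k) = \frac{1}{2c_{1m}}\, p_m(k+1), \qquad c_{1m} > 0 \tag{10}$$

$$v_m(k) = \frac{1}{2d_{1m}}\, q_m(k+1), \qquad d_{1m} > 0 \tag{11}$$

which, in connection with Eqs. (6) and (7), gives

$$s_m(k+1) = s_m(k) + \frac{\Delta}{2c_{1m}}\, p_m(k+1) \tag{12}$$

$$g_m(k+1) = g_m(k) + \frac{\Delta}{2d_{1m}}\, q_m(k+1) \tag{13}$$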
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The  adjoint  equations  are 
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n=  1 
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(14) 


q(k+  1)  = 

n=  1 


K-l 
so that the optimal set of equations for a signal-detector pair consists of Equations (12) through (15). The minimization of the overall system cost functional in Eq. (3), subject to correlation and first-order bandwidth constraints on all signals and detectors, therefore involves the solution of a system of 4M coupled, first-order, linear sum-difference equations.


In view of the fact that the cost functional is expressed in terms of various correlation coefficients, the values of these coefficients must be specified for a unique solution of the optimization problem. The three types of coefficients may be considered to be the elements of three matrices: a signal-detector cross-correlation (cost) matrix $A = \{a_{mn}\}$, a signal correlation matrix $\Psi = \{\psi_{mn}\}$, and a detector correlation matrix $\Phi = \{\phi_{mn}\}$. While the matrix A may take on a number of forms, two forms are particularly interesting: those leading to formulations yielding the minimum probability of error, $P_e$, and those which maximize the probability of detection, $P_d$. The first form of the A matrix is given by (see Ref. 4, Chapter 10)

$$a_{mn} = 1 - \delta_{mn} \tag{16}$$

where $\delta_{mn}$ is the Kronecker delta, and the other by

$$a_{mn} = -\delta_{mn} \tag{17}$$


In Eq. (17) the negative sign arises from the fact that $P_d$ is maximized. In terms of the matrices of Eqs. (16) and (17), the optimal equations, Eqs. (14) and (15), reduce to

(18)

for maximum of $P_d$. It should be noted that for the above two cases, the optimal equations differ in the sense that in the minimum of $P_e$ the m-th adjoint variables are cross-coupled to the n-th signal or correlation function, while for the maximum of $P_d$ no such coupling exists.
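The two cost matrices of Eqs. (16) and (17) can be written out for a small alphabet to make the coupling remark concrete. The following is a minimal sketch; the function names are illustrative and do not come from the report.

```python
def cost_matrix_min_error(M):
    """Cost matrix a_mn = 1 - delta_mn, the form of Eq. (16)."""
    return [[0 if m == n else 1 for n in range(M)] for m in range(M)]


def cost_matrix_max_detection(M):
    """Cost matrix a_mn = -delta_mn, the form of Eq. (17)."""
    return [[-1 if m == n else 0 for n in range(M)] for m in range(M)]


def is_cross_coupled(A):
    """True if any off-diagonal cost is nonzero, i.e. the m-th
    equations involve the n-th signal or detector for n != m."""
    size = len(A)
    return any(A[m][n] != 0 for m in range(size) for n in range(size) if m != n)


A_error = cost_matrix_min_error(4)
A_detect = cost_matrix_max_detection(4)
print(is_cross_coupled(A_error))   # off-diagonal ones couple the pairs
print(is_cross_coupled(A_detect))  # purely diagonal matrix
```

The off-diagonal entries of the first form are what tie each signal-detector pair to the others; the second form is diagonal, so each pair can be treated separately.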

Similar to the matrix A, the correlation matrices $\Psi$ and $\Phi$ may assume a variety of forms, some of which, however, may lead to nonlinear boundary conditions. In the special case of orthogonal signals and detectors, these become

$$\psi_{mn} = \psi\, \delta_{mn} \tag{22}$$

and

$$\phi_{mn} = \phi\, \delta_{mn} \tag{23}$$


B.  Extension  to  Bandlimited  Channels 

The formulation of the previous section can be directly extended to the case of a general bandlimited channel. However, for simplicity of the presentation, the problem is developed in detail for the specific case of a one-pole channel and white Gaussian noise environment. The problem is formulated for the maximum of $P_d$ solution in terms of a set of orthogonal channel output signals. The solutions are considered for equal energy input signals using appropriate sets of parameter values and boundary conditions.


For a one-pole channel, the A matrix of the form of Eq. (17), and the correlation matrices in Eqs. (22) and (23), the functional of Eq. (2) for $L_s = 2$ and $L_g = 0$ is

[Eq. (24) is not legible in the source.]

where $x_{3m}(k)$ are the channel output variables. Defining control variables $u_m(k) = \nabla x_{2m}(k)$ and $v_m(k) = g_m(k)$, the constraint equations are

$$x_{sm}(k+1) = x_{sm}(k) + \Delta\, s_m^2(k)$$

$$x_{gm}(k+1) = x_{gm}(k) + \Delta\, v_m^2(k)$$

$$s_m(k+1) = s_m(k) + \Delta\, x_{2m}(k) \tag{25a}$$

$$x_{3m}(k+1) = \alpha \Delta\, s_m(k+1) + e^{-\beta \Delta}\, x_{3m}(k) \tag{25b}$$

where $\alpha$ and $\beta$ are the channel parameters as in Reference 1.
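The one-pole channel recursion of Eq. (25b) can be illustrated numerically. The sketch below assumes the recursion reads x3(k+1) = alpha*Delta*s(k+1) + exp(-beta*Delta)*x3(k), as recovered from the printed equation, with a zero initial channel state; the function name and the test input are illustrative only.

```python
import math


def one_pole_channel(s, alpha, beta, delta, x0=0.0):
    """Channel output x3(k) for input samples s(k), using
    x3(k+1) = alpha*delta*s(k+1) + exp(-beta*delta)*x3(k), x3(0) = x0."""
    decay = math.exp(-beta * delta)
    x = [x0]
    for k in range(len(s) - 1):
        x.append(alpha * delta * s[k + 1] + decay * x[k])
    return x


# A single unit sample at k = 1 shows the exponential tail that
# produces intersymbol interference in later symbol intervals.
response = one_pole_channel([0.0, 1.0, 0.0, 0.0], alpha=1.0,
                            beta=math.log(2.0), delta=1.0)
print(response)
```

With beta*Delta = ln 2 the tail halves at every step, so energy from one symbol leaks into the following ones unless the signal set is designed to account for it.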

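For the case of white Gaussian noise, the Hamiltonian in terms of Eqs. (24) and (25) is given by

$$H(k) = \sum_{m=1}^{M}\Big\{\Delta\Big[2x_{3m}(k)\,v_m(k) - c_{1m}\,x_{2m}^2(k) - c_{2m}\,u_m^2(k) - \frac{BN_0}{2}\,v_m^2(k)\Big] + \psi\big[x_{sm}(k) + \Delta s_m^2(k)\big] + \phi\big[x_{gm}(k) + \Delta v_m^2(k)\big] + p_{1m}(k+1)\big[s_m(k) + \Delta x_{2m}(k)\big] + p_{2m}(k+1)\big[x_{2m}(k) + \Delta u_m(k)\big] + p_{3m}(k+1)\big[\alpha\Delta s_m(k) + \alpha\Delta^2 x_{2m}(k) + e^{-\beta\Delta}x_{3m}(k)\big]\Big\} \tag{26}$$

Maximizing Eq. (26) with respect to $u_m(k)$ and $v_m(k)$ results in

$$v_m(k) = -\frac{2}{2\phi - BN_0}\, x_{3m}(k), \qquad B > 0 \tag{27}$$

$$u_m(k) = \frac{1}{2c_{2m}}\, p_{2m}(k+1), \qquad c_{2m} > 0 \tag{28}$$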

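It can be shown that the adjoint equations reduce to M identical sets of coupled linear difference equations with constant coefficients of the form

$$s_m(k+1) = s_m(k) + \Delta\, x_{2m}(k)$$

$$x_{2m}(k+1) = \frac{\Delta^3 \psi}{c_{2m}}\, s_m(k) + \Big[1 + \frac{\Delta^2 c_{1m}}{c_{2m}}\Big] x_{2m}(k) - \frac{\Delta^2}{2c_{2m}}\, p_{1m}(k) + \frac{\Delta}{2c_{2m}}\, p_{2m}(k)$$

$$x_{3m}(k+1) = \alpha\Delta\, s_m(k) + \alpha\Delta^2\, x_{2m}(k) + e^{-\beta\Delta}\, x_{3m}(k)$$

$$p_{1m}(k+1) = -2\Delta\psi\, s_m(k) - \alpha\Delta W e^{\beta\Delta}\, x_{3m}(k) + p_{1m}(k) - \alpha\Delta\, e^{\beta\Delta}\, p_{3m}(k)$$

$$p_{2m}(k+1) = 2\Delta^2\psi\, s_m(k) + 2\Delta c_{1m}\, x_{2m}(k) - \Delta\, p_{1m}(k) + p_{2m}(k)$$

$$p_{3m}(k+1) = W e^{\beta\Delta}\, x_{3m}(k) + e^{\beta\Delta}\, p_{3m}(k) \tag{29}$$

where

$$W = 4\Delta\,[2\phi - BN_0]^{-1}.$$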


C.  Numerical  Examples 

As a numerical example of the above formulation, consider the solution of Eq. (29) for M = 4 (quaternary system). The parameter values and boundary conditions yielding mutually orthogonal channel output signals, i.e., those satisfying the relationship

$$C_{ij} = \Delta \sum_{k=0}^{K} x_{3i}(k)\, g_j(k) = c\, \delta_{ij}, \qquad i, j = 1, 2, \ldots, M \tag{30}$$

where $\delta_{ij}$ is the Kronecker delta, are given in Table I. The solutions are illustrated in Figs. 2 and 3 for equal energy input signals and a transmission interval T = 2. The last two parameter values are chosen to permit a comparison of system performance with a binary system on the basis of equal data transmission rate. From the figures, it is seen that the direct solution of the problem yields a signalling alphabet with a large spread in frequency. In this form the solution is not useful and a modification is needed.

TABLE I. Parameter values and boundary conditions for quaternary orthogonal solution of Equations (29).

(Column headings legible in the source: Solution, ψ, x_2(k), Gabor BW, p(K). The source prints five values per row.)

Solution
s_1     .01     1.0      1.20     8.34     .440
s_2     .01     1.0     -1.46    20.85     .754
s_3     .01     9.4      1.45    36.05    1.098
s_4     .01    24.0     -1.64    56.09    1.435

K = 30, T = 2, E_s = 10, and all other constants equal to unity.


For  an  M-ary  signal  alphabet,  the  overall  Gabor  bandwidth  may  be  defined  either  as 
or 


= [J_  V 

'■  M Lj  ° IT,  ^ 
m=  I 


where 


? K-1  - 

=aV  [Vs  (k)]^  and  m=l,2,---,M. 
m , m 

k=0 

Nuttall and Amoroso (Ref. 7) have shown that alphabets of equal bandwidth signals which minimize both Eqs. (31) and (32) can be obtained from a set of harmonic signals by means of orthogonal transformations proportional to the Hadamard matrices. It is known that Hadamard matrices exist for M = 2, 4, 8, 12, ..., 200 with few exceptions (see Ref. 9, Theorem 4.4). For M = 4, there are two nonequivalent forms of the orthogonal matrix given by

$$U_1 = \frac{1}{\sqrt{4}} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix} \tag{33}$$

and

$$U_2 = \frac{1}{\sqrt{4}} \begin{bmatrix} 1 & 1 & 1 & -1 \\ 1 & 1 & -1 & 1 \\ 1 & -1 & 1 & 1 \\ -1 & 1 & 1 & 1 \end{bmatrix} \tag{34}$$
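A short numerical check helps show why transformations of this kind are usable here: a matrix proportional to a 4 x 4 Hadamard matrix is orthogonal, so applying it to a set of orthonormal signals yields another orthonormal set. The sketch below assumes the sign patterns and the 1/2 scaling read from Eqs. (33) and (34); the helper names and the toy "signals" are illustrative, not taken from the report.

```python
H1 = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
H2 = [[1, 1, 1, -1], [1, 1, -1, 1], [1, -1, 1, 1], [-1, 1, 1, 1]]


def scaled(H):
    """Scale a 4 x 4 sign matrix by 1/2 so that its rows have unit norm."""
    return [[h / 2 for h in row] for row in H]


def gram(rows):
    """Matrix of inner products between every pair of rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]


def is_orthonormal(rows, tol=1e-12):
    """True if the Gram matrix of the rows is the identity."""
    G = gram(rows)
    n = len(rows)
    return all(abs(G[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))


def transform(U, signals):
    """New signal i is the combination sum_j U[i][j] * signals[j]."""
    length = len(signals[0])
    return [[sum(U[i][j] * signals[j][k] for j in range(len(signals)))
             for k in range(length)] for i in range(len(U))]


U1, U2 = scaled(H1), scaled(H2)
# Toy orthonormal "signals": one unit sample each, at different times.
basis = [[1.0 if k == m else 0.0 for k in range(4)] for m in range(4)]
print(is_orthonormal(U1), is_orthonormal(U2))
print(is_orthonormal(transform(U1, basis)), is_orthonormal(transform(U2, basis)))
```

Because every entry of the transformation has the same magnitude, each new signal draws equally on all the original ones, which is what allows the bandwidths of the transformed alphabet to be equalized.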


An application of the transformations in Eqs. (33) and (34) to the unequal bandwidth solution of Eq. (29) results in two nearly equal bandwidth orthogonal systems with reduced overall Gabor bandwidths in the sense of Eqs. (31) and (32). These desirable alphabets and detectors are presented in Figures 4 and 5, respectively.
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A GENERALIZED  QUANTILE  DETECTOR 
J.I.  Cochrane  and  L.  Kurz. 

In this report, a particularly robust subset of quantile detectors is considered. Unlike the rank detectors, the detectors considered here are easy to implement. The knowledge of the quantiles required for proper operation of the detector can be obtained using one of the methods suggested by Kersten and Kurz (Refs. 1 and 2).

A.  Mathematical  Formulation  and  Asymptotic  Properties 

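Consider $n_i$ i.i.d. samples from the distributions $F_i$, $i = 1, 2, \ldots, M$. Form the empirical distribution functions

$$\hat{F}_i(x) = \frac{1}{n_i} \sum_{j=1}^{n_i} u(x - x_{ij})$$

where $u(\cdot)$ is the unit step function and $x_{ij}$ is the j-th sample from $F_i$. Define the weighted sum of empirical distribution functions

$$\sum_{i=1}^{M} p_i\, \hat{F}_i(x)$$

where

$$\sum_{i=1}^{M} p_i = 1, \qquad p_i > 0, \qquad \sum_{i=1}^{M} n_i = N$$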
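The empirical distribution functions and their weighted sum are simple to compute directly. The sketch below assumes the convention u(0) = 1 for the unit step, so each sample is counted when it is less than or equal to the evaluation point; the function names and the sample values are illustrative only.

```python
def ecdf(samples):
    """Empirical distribution function: fraction of samples <= x."""
    n = len(samples)

    def F(x):
        return sum(1 for xj in samples if xj <= x) / n

    return F


def weighted_ecdf(sample_sets, weights):
    """Weighted sum of the empirical distribution functions of several
    sample sets; the weights are positive and must sum to one."""
    if any(p <= 0 for p in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be positive and sum to one")
    functions = [ecdf(s) for s in sample_sets]

    def W(x):
        return sum(p * F(x) for p, F in zip(weights, functions))

    return W


F1 = ecdf([1.0, 2.0, 3.0, 4.0])
mix = weighted_ecdf([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.5])
print(F1(2.5), mix(2.5))
```

Both functions are step functions of x, which is why a test built on their values at a few thresholds needs only counting operations.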

Form a class of tests

$$T_i = \sum_{j=1}^{m} C_j\, W_i(\lambda_j) \tag{1}$$

where

$$0 < \lambda_1 < \cdots < \lambda_m < 1$$

and $C_j$ are appropriately selected constants. $T_i$ of Eq. (1) is a general representation for a broad class of threshold tests. Depending on the choice of the $\lambda_j$ and the relative growth rates of the $n_i$, Eq. (1) may be used for asymptotic analysis of quantile and joint quantile tests with random and fixed partitions. From the partition viewpoint, the number of thresholds m is a measure of the robustness of the test. Since for small m the tests have still high efficacy, the advantages of simplicity and robustness of the
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The  maximum  efficacy  equals  the  largest  eigenvalue  of  B ^ and  the  optimum  scores 
of  the  test  are  the  corresponding  eigenvector.  Since  B is  of  rank  1,  Eq.  (7)  reduces  to 


^ -1 

<f(T.)  = F*  V F 


(8) 


where 


and 


(9) 


0=0 


B.  Numerical  Results 


To illustrate the performance of the $T_i$ statistics, efficacy calculations have been carried out for the Lehmann alternative, the Gaussian scale and shift alternatives, and the Cauchy scale and shift alternatives. The Cauchy alternatives are chosen to illustrate the performance of the classifiers in an extreme case where the moments of the distribution do not exist. The properties of the Lehmann alternative and its relation to the shift alternative are discussed elsewhere in this report.

In Table I, the efficacy of the $T_i$ statistic for M = 2 is shown for increasing values of m, where the parameter $(n_1 + n_2)/n_1 n_2$ has been suppressed. In each case the set of generalized quantiles $\lambda_1, \ldots, \lambda_m$ has been selected to maximize the efficacy. The last column is the efficacy of the locally most powerful test for the particular case.

TABLE I. Maximum efficacy of T_i statistics.

Alt. \ m           1       2       3       4       5       6      e_max
Gaussian Shift    .64     .810    .883    .920    .942    .956    1.
Cauchy Shift      .405    .429    .434    .470    .479    .483    .5
Lehmann Alt.      .646    .820    .891    .927    .948    .961    1.
Gaussian Scale    .61    1.30     --     1.64     --     1.76     2.
Cauchy Scale      .143    .405    --      .461    --      .479    1.5
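The m = 1 entries for the two shift alternatives can be checked by hand. The sketch below assumes the standard asymptotic expression for the efficacy of a test based on a single quantile under a shift alternative, f(xi)^2 / (lambda (1 - lambda)) with xi the lambda-quantile of the noise density f. That expression is brought in from general theory and is not written out in this report. It is evaluated at the median, which is the best single threshold for both symmetric densities.

```python
import math
from statistics import NormalDist


def single_quantile_shift_efficacy(pdf, quantile, lam):
    """Efficacy of a one-threshold quantile test for a shift alternative,
    assuming the form pdf(xi)^2 / (lam * (1 - lam)), xi = quantile(lam)."""
    xi = quantile(lam)
    return pdf(xi) ** 2 / (lam * (1.0 - lam))


def cauchy_pdf(x):
    """Standard Cauchy density."""
    return 1.0 / (math.pi * (1.0 + x * x))


def cauchy_quantile(lam):
    """Inverse of the standard Cauchy distribution function."""
    return math.tan(math.pi * (lam - 0.5))


gauss = NormalDist()  # zero mean, unit variance
e_gauss = single_quantile_shift_efficacy(gauss.pdf, gauss.inv_cdf, 0.5)
e_cauchy = single_quantile_shift_efficacy(cauchy_pdf, cauchy_quantile, 0.5)
print(round(e_gauss, 3), round(e_cauchy, 3))
```

The two values are 2/pi (about 0.64) and 4/pi^2 (about 0.405), in agreement with the first column of the table for the Gaussian and Cauchy shift rows.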
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The  efficacy  of  the  best  Tj^  test  reaches  94%  of  the  locally  most  powerful  test  for 
m = 5 for  the  shift  and  Lehmann  alternatives,  and  82%  for  the  Gaussian  scale  alternatives 
for  m = 4. 

The efficacy of the $T_i$ tests for M = 2 is independent of the parameters $p_1$ and $p_2$, and letting $p_1 = 0$ we see that the two-sample quantile test has the same efficacy as the rank test while offering increased simplicity.

Tables II and III show the optimum $\lambda$ and C for the examples calculated. The scores have been normalized by $C^T C = 1$. Table IV is included to demonstrate the robustness of the $T_i$ tests. In this table the test designed for one hypothesis testing problem (detection problem) has been applied to a different noise environment. The resulting loss compared to the best $T_i$ test for that environment is tabulated.

C.  Finite  Sample  Size  Analysis 

In this section, the exact moments of $W_1(\lambda)$ under the Lehmann alternative (M = 2) are expressed in computationally convenient form. Since the $T_i$ statistics are linear functionals of $W_i(\lambda)$, these computations are useful in comparing finite sample size and asymptotic behavior of the tests.

The  moments  of  Wj(X)  are  calculated  by  noting  that  it  can  be  expressed  as  a sum 
of  binary  random  variables 


Wj(X)  - n^‘  V “(^j)  *y(f{j))/ 

J=1 

where $x_{(j)}$ is the j-th order statistic from $F_1$, $y_{(j)}$ is the j-th order statistic from $F_2$, and $f(\cdot)$ is a nondecreasing integer function of j. Since $f(\cdot)$ is a nondecreasing function, the joint probabilities which appear in the calculation of the higher moments simplify to allow exact computation. If $F_1$ and $F_2$ are absolutely continuous, it can be shown that the moments are


e[w*;"(X)j=  A^(k)P^^p(k) 

k=  1 


where 


- (k-  1)"’] 
TABLE IV. Loss in dB for three environments.

       Optimum Settings for       Optimum Settings for         Optimum Settings for
       Lehmann Alternatives       Gaussian Shift Alternative   Cauchy Shift Alternative
m      Gaussian     Cauchy        Lehmann      Cauchy          Lehmann      Gaussian
       Noise        Noise         Alt.         Noise           Alt.         Noise
1      1.11         5.65          1.29         0               1.29         0
2      1.04         4.74          1.15         2.55            1.89         0.62
3      1.00         4.45          1.08         2.74            3.36         3.20
4      1.06         4.76          1.02         3.24            4.47         2.99
5      0.96         4.89          0.99         3.36            4.47         2.99
6      0.93         4.96          0.97         3.44            4.47         3.01

n 1 n 1 

U*^)  - 1 j=k 

if  1 < f(k)  <n^, 

Otherwise 

P\  ^ ^ ”2 

= 0 if  f(k)  < 1 


$B(\cdot\,, \cdot)$ is the tabulated Beta function and $f(k)$ is the smallest integer greater than or equal to


(X 


Pj(k-  1) 


(14) 
THE NYQUIST PROBLEM IN THE PRESENCE OF NONLINEARITIES IN THE DATA TRANSMISSION SYSTEM

L.  Kurz  and  G.  Soloway 

It has been previously established (Ref. 1) that the introduction of a memoryless nonlinearity between the channel and the receiver filter, for the purpose of suppressing impulsive noise, complicates the problem of intersymbol interference. It is no longer possible to achieve perfect equalization by the application of Nyquist's criterion, although for most reasonable signals, the intersymbol interference will be bounded.

In this report, a frequency domain approach to equalization in a data transmission system in the presence of a memoryless nonlinearity is considered, with particular stress on the elimination of intersymbol interference caused by the most troublesome terms.

A.  A Frequency  Domain  Receiver  Design  Procedure 

Results of Ref. 1 suggest that the linear receiving filter (equalizer) can be designed for an input signal which is a sum of the linear and cubic terms. This has one major drawback. The relative amplitudes of the linear and cubic terms are not invariant. They are a function of the nonlinearity, signal amplitude, and noise parameters. It is highly undesirable to use a design procedure which would change as the variance of the noise varies.

Theorem 1 of Ref. 1 suggests a method by which this problem may be alleviated. Let the desired part of the signal, usually the linear part, have a spectrum $G_1(\omega)$. Let the undesired part of the signal, perhaps the cubic term, have a spectrum $G_3(\omega)$. From Nyquist's criterion and Theorem 1 of Ref. 1, the following constraint equations can be written

= Tx^  1^)1  < -^  U) 

k 

(2) 


(3) 
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i 


from  which  one  arrives  at 


R 


yM 


where  * denotes  complex  conjugate. 

Using  Eq.  (4)  in  Eqs.  (1)  and  (2)  yields 

]=  Tx^ 
k 

V{^1  [G^%)f  ^ = 0 


Solving  Eq.  (6)  for  v(cj) 


v(w)  = 


(4) 

(5) 

(6) 

(7) 


Using Eq. (7) in Eq. (5) and solving for X(ω) yields
Tx 


X (u>) 


k 


(a,)  ( ^ + 


[i 


(9) 


Similarly,  solving  for  X(w)  from  Eq.  (6)  and  using  this  in  Eq.  (5)  yields  the  following 
expression  for  v(u) 


v{u) 


E{[ 


T 


]i 


Tx 


2 1 p(k) 

- 2 


(10) 


t 

Substituting from Eqs. (9) and (10) in Eq. (4) results in an expression for the desired filter




r(k)  _ 


-[^ 


y 


-]g^(.)  CG^Mf ; 


Tx^[G^^)(u,)f 


zi' 


[■ 


y [Gj^>{c.)f  V) 


]1G^.)1^} 


(II) 


1 

It is possible to extend this result to include more terms. The number of terms for which intersymbol interference can be eliminated depends upon the bandwidth available. The cubic device produces power terms and cross-product terms with three times the original signal bandwidth. This extra bandwidth introduces additional degrees of freedom in the receiver design, allowing for additional constraints to be satisfied.


B.  Examples 

An example will help clarify the use of the filter design procedure. As in standard PAM systems, the signal is assumed to have a frequency response corresponding to the square root of a raised cosine. Here the excess bandwidth was taken to be 0.5. This frequency response and that of its corresponding time response cubed are shown in Figure 1. Equation (11) may now be solved for the desired filter frequency response. This filter characteristic is shown in Figure 1. The time responses which result from passing the linear and cubic terms through this filter are shown in Figure 2.
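For the linear term alone, the starting point of this example is the familiar fact that a raised-cosine overall response satisfies Nyquist's criterion: its replicas shifted by multiples of 1/T add up to a constant. The sketch below checks this numerically for an excess bandwidth of 0.5, using the textbook raised-cosine formula with T normalized to 1; it does not reproduce the filter of Equation (11), and all names are illustrative.

```python
import math


def raised_cosine(f, T=1.0, rolloff=0.5):
    """Raised-cosine spectrum with symbol interval T and given rolloff."""
    f = abs(f)
    f1 = (1.0 - rolloff) / (2.0 * T)   # end of the flat region
    f2 = (1.0 + rolloff) / (2.0 * T)   # end of the cosine rolloff
    if f <= f1:
        return T
    if f <= f2:
        return 0.5 * T * (1.0 + math.cos(math.pi * T / rolloff * (f - f1)))
    return 0.0


def folded_spectrum(f, T=1.0, rolloff=0.5, span=3):
    """Sum of spectral replicas shifted by k/T, k = -span..span."""
    return sum(raised_cosine(f - k / T, T, rolloff)
               for k in range(-span, span + 1))


for f in (0.0, 0.1, 0.25, 0.4, 0.5):
    print(f, folded_spectrum(f))
```

The folded spectrum equals T at every frequency, which is the frequency-domain statement of zero intersymbol interference. The cubed time response does not have this property, which is why a separate constraint on the cubic term is needed.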

It may easily be seen that the design satisfies all the constraints. However, this result has not been achieved without penalty. The time responses are not well damped, which will increase jitter problems. Also, a glance at the filter response reveals that this filter permits a far greater amount of noise to pass through. Therefore, the system designer must choose whether it is more important to eliminate the distortion due to the cubic term or to suppress the noise more effectively.

An  alternative  would  be  to  control  the  intersymbol  interference  due  to  the  cubic 
term  while  not  reducing  the  intrasymbol  interference  to  zero.  To  accomplish  this  the 
right  hand  side  of  constraining  Eq.  (2)  would  be  set  equal  to  a constant,  say  Tx^. 
Following  the  same  procedure  which  resulted  in  Eq.  (11),  one  obtains 


'' 


cubic  terms . 
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= 


{t-o 


rZ[G?MfGg)(<0) 


]gJ^‘^V)[g^*"W} 


i 


lG^’(w)l^CG^‘‘’(.i)f 


poWMp 


] Ml'} 


The design problem considered in the first part of this section is repeated using Equation (12). The filter frequency response is shown in Figure 3. It is seen that this represents a far more practical solution. This filter will have superior noise performance compared to that of Figure 1. Furthermore, the time responses are well damped (see Figure 4). Its only drawback is that a small intrasymbol interference term remains which is a function of the nonlinearity, the noise, and the signal amplitude.
EDGE DETECTION USING ANOVA TECHNIQUES IN CONJUNCTION WITH QUANTILE STATISTICS

L.  Kurz  and  P.  Legakis 

In this report, analysis of variance (ANOVA) techniques are employed for the solution of the edge detection problem. Although ANOVA techniques have been applied to edge detection problems before (Ref. 1), the procedures presented here take full advantage of the data reduction based on estimated quantiles. Specifically, techniques based on the one-way analysis of variance of quantile estimates are used. The quantile estimation procedures yield asymptotically normal estimates regardless of the distribution of the observables. In addition, by properly selecting the quantiles, the much simpler fixed effects model of analysis of variance can be applied to gray level and texture edge detection.


A.  Statement  of  the  Analysis  of  Variance  Problem 

Let Q be an $L_m \times L_n$ array of quantiles resulting from an array H of image intensities after some form of quantile estimation on H. Let $q_{rs}$ be the quantile estimate based on a sample of the intensities H. Assuming a rectangular scan on the original data, the order in which the quantiles are transmitted by the estimator is

$$q_{11},\ q_{12},\ \ldots,\ q_{1L_n},\ q_{21},\ \ldots,\ q_{2L_n},\ \ldots,\ q_{L_m L_n} \tag{1}$$

Each of the elements $q_{rs}$ may be thought of as a random variable with mean $\mu_{rs}$ and variance $\sigma_{rs}^2$. Furthermore, the random variables $q_{rs}$ are asymptotically normally distributed regardless of whether stochastic approximation or the order statistic approach is used in their estimation. The means $\mu_{rs}$ have different meanings depending on what the quantiles are. For example, if $\{q_{rs}\}$ estimate the 50th percentile, $\mu_{rs}$ is the mean due to the presence of gray level at the point (r, s) and $\sigma_{rs}^2$ is the variance which accounts for zero-mean additive noise due to finite scanner bandwidth, image quantization, transmission, and image texture. If $q_{rs}$ is a high quantile, e.g., the 95th percentile, then $\mu_{rs}$ is the average change in texture at the point (r, s) and the variance has the same significance as for the case where $q_{rs}$ is the 50th percentile estimate. Thus, edges are defined in terms of the values of $q_{rs}$ at either side of the edge. For gray level edges, the quantile estimates are medians and the mean of the medians changes across the two adjacent gray levels while the variance stays the same. For texture edges, the quantile estimates are high percentiles and the mean of the percentiles changes when actual changes in texture are encountered. In either case, one is looking for a shift in the mean of either the 50th or the high (low) percentiles.
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B.  Edge  Detection  Using  One-Way  ANOVA 

For each point (r, s) of the $L_m \times L_n$ quantile array Q, a neighborhood is defined consisting of a horizontal array of M groups, each group consisting of N points. The whole neighborhood is centered around the point (r, s), where $r = 1, 2, 3, \ldots, L_m$ and $s = k, k+1, \ldots, L_n - k$, with k the smallest integer greater than or equal to $(M \times N)/2$.

Denoting by ${}_{rs}X$ the array representing the quantile neighborhood of (r, s), ${}_{rs}X$ consists of elements $\{{}_{rs}x_{ij}\}$ which can be written in terms of the quantile elements as follows:

$${}_{rs}x_{ij} = q_{r;\, s-k+N(i-1)+j}, \qquad i = 1, 2, \ldots, M;\quad j = 1, 2, \ldots, N;\quad r = 1, 2, \ldots, L_m;\quad s = k, k+1, \ldots, L_n - k \tag{2}$$

If for any integer N the notation

$$\bar{N} = 1, 2, 3, \ldots, N \tag{3}$$

is adopted, then a computationally simpler representation of ${}_{rs}X$ results, namely,

$${}_{rs}x_{\alpha} = q_{r;\, s-k+\alpha}, \qquad \alpha \in \overline{M \times N};\quad r = 1, 2, \ldots, L_m;\quad s = k, k+1, \ldots, L_n - k \tag{4}$$

Note that the expression for the quantiles has one row index and M × N column indices as required for the formulation of ${}_{rs}X$. Also, the reason why s starts from k is that one wants to form neighborhoods that are centered around the points of interest (r, s) and thus, by necessity, the first and last (k - 1) columns of the quantile array Q are omitted so far as the classification of those particular points is concerned. They are, however, taken into account beginning with column k if the neighborhood includes them. The exclusion of the first and last (k - 1) columns does not constitute any serious drawback for the procedure since the area of interest usually occupies the center of the picture and the scanner resolution is such that k is much smaller than the distance between successive edges.
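The index arithmetic of the neighborhood can be sketched in a few lines. The code below assumes the reading of Eq. (2) in which element (i, j) of the neighborhood of (r, s) is the quantile in row r and column s - k + N(i - 1) + j, with k taken as the ceiling of (M × N)/2. Row and column numbers are 1-based as in the text, and the function name is illustrative.

```python
import math


def neighborhood(Q, r, s, M, N):
    """Return the M groups of N quantiles around the point (r, s).

    Q is a list of rows; r and s are 1-based indices, and s must lie
    between k and L_n - k, where k = ceil(M * N / 2)."""
    k = math.ceil(M * N / 2)
    row = Q[r - 1]
    if not (k <= s <= len(row) - k):
        raise ValueError("s must lie between k and L_n - k")
    return [[row[s - k + N * (i - 1) + j - 1] for j in range(1, N + 1)]
            for i in range(1, M + 1)]


# One row of ten quantiles whose values equal their column numbers.
Q = [list(range(1, 11))]
print(neighborhood(Q, r=1, s=5, M=2, N=2))
```

For M = N = 2 and s = 5 the neighborhood covers columns 4 through 7, split into two groups of two, so the point of interest sits between the groups being compared.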

The array ${}_{rs}X$ can be thought of as a mask that is moved over the points (r, s) for a total of $L_m \times (L_n - 2k)$ times for the complete coverage of the quantile array Q. The masking process together with the application of one-way ANOVA techniques will result in $L_m \times (L_n - 2k)$ statistics which give an edge-enhanced representation of the original picture.

If an n:1 data reduction ratio is assumed from the array of observations to the array of quantiles, then for fixed image size and variable scanner resolution $(N_r, N_c)$ on the original data, the minimum distance between successive edges must be a large multiple of M × N × n picture elements in order to avoid multiple false edge detections and significant performance degradation. This requirement also helps in the picture left- and right-margin problem because if $N_c \gg M \times N \times n > (M \times N \times n)/2$, the border points become comparatively insignificant. Notice that the one-way ANOVA formulation does not impose any requirements on $N_r$, the vertical scanner resolution, because the operation proceeds along rows rather than columns.

For the formation of neighborhoods that have vertical and diagonal, in addition to horizontal, symmetry, one can still use one-way ANOVA with two treatments provided that a reference sample is available. The first treatment consists of the points of the reference sample, which is of known classification, and the second treatment consists of all the neighborhood points of (r, s) as a group. In the latter case, where a reference sample is available, one is actually performing pattern, as opposed to edge, detection, and this procedure will give much better results than direct one- or two-way ANOVA. This should come as no surprise for two reasons. First, the choice of the neighborhood is more representative of the environment of (r, s) and, second, the availability of a reference sample yields additional knowledge that any good detector should take advantage of.

Two-dimensional neighborhoods, in addition to the requirements on the scanner resolution $N_c$ in the horizontal or x-direction, impose constraints on the scanner resolution $N_r$ along the vertical or y-direction, $N_r \gg M1$, where M1 is the number of rows that are used in the formation of the neighborhood of (r, s). Notice that the data reduction factor n does not impose any constraints on the vertical direction, because quantile estimation takes place along the x-direction only. Large scanner resolutions are desirable because they provide much more information about the scanned image, but they must be moderate enough to make the storage problem and the I/O activity realistic. This excludes the consideration of the processing time of the detection procedure.

So far, scanner resolution requirements and the general form of the neighborhood array have been discussed, yet the problem of selecting the number of groups of observations M and the number of observations N within each group has been excluded. As long as the neighborhood parameters M and N are greater than unity, the neighborhood can be used as a possible array in the one-way ANOVA scheme. Increasing the product of these parameters decreases the variance of the test statistic ${}_{rs}F$ and has a beneficial effect on the significance and power. However, if the product is made too large, one runs the risk of multiple detections. The best policy is to make the product $M \times N$ as large as possible, subject to the requirement that it should also be much smaller than the distance between successive edges. Since no feasible analytical procedure for selecting optimum values of M and N has been found, the problem is resolved via simulation.

In the application of the one-way ANOVA techniques for edge detection, arrays are formed as indicated above, and the problem is posed in a hypothesis-testing framework as follows:

H:  groups are homogeneous  →  No edge
K:  groups are heterogeneous  →  Edge

The pertinent statistics for the point (r, s) can be obtained through iteration by varying r and s over their respective ranges.


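\[ {}_{rs}\bar{x}_{i\cdot} = \frac{1}{N} \sum_{j=1}^{N} x_{r;\,s-k+N(i-1)+j} \tag{5} \]

\[ {}_{rs}\bar{x}_{\cdot\cdot} = \frac{1}{M} \sum_{i=1}^{M} {}_{rs}\bar{x}_{i\cdot} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} x_{r;\,s-k+N(i-1)+j} \tag{6} \]

\[ {}_{rs}S_a^2 = N \sum_{i=1}^{M} \left( {}_{rs}\bar{x}_{i\cdot} - {}_{rs}\bar{x}_{\cdot\cdot} \right)^2 \tag{7} \]

\[ {}_{rs}S_e^2 = \sum_{i=1}^{M} \sum_{j=1}^{N} \left( x_{r;\,s-k+N(i-1)+j} - {}_{rs}\bar{x}_{i\cdot} \right)^2 \tag{8} \]

\[ {}_{rs}S_T^2 = {}_{rs}S_a^2 + {}_{rs}S_e^2 \tag{9} \]

\[ {}_{rs}F = \frac{{}_{rs}S_a^2 / (M-1)}{{}_{rs}S_e^2 / [M(N-1)]} = \frac{M(N-1)\,{}_{rs}S_a^2}{(M-1)\,{}_{rs}S_e^2} \tag{10} \]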


Since the quantiles are asymptotically normally distributed,

\[ \text{under } H: \quad {}_{rs}F \sim F_{(M-1),\,M(N-1)} \tag{11} \]

\[ \text{under } K: \quad {}_{rs}F \sim F_{(M-1),\,M(N-1)}(\delta^2) \tag{12} \]


where $F_{(M-1),\,M(N-1)}$ denotes a central F and $F_{(M-1),\,M(N-1)}(\delta^2)$ a non-central F distribution with $(M-1)$ degrees of freedom in the numerator and $M \times (N-1)$ degrees of freedom in the denominator. If the groups are heterogeneous, a vertical edge exists in the neighborhood of the point (r, s), which causes the group effects $\{\alpha_i\}$ to increase. This increase causes a shift in the mean of ${}_{rs}F$, and the hypothesis H is rejected at a level $\alpha$. However, since asymptotic normality conditions may require an excessively large sample, the statistic ${}_{rs}F$ can still be used with a threshold $T_0$, and the decision is made as follows:

\[ {}_{rs}F \; \begin{cases} > T_0 & \rightarrow \ \text{Edge at } (r, s) \\ \le T_0 & \rightarrow \ \text{No edge at } (r, s) \end{cases} \tag{13} \]

In this case, not only must M and N be properly selected, but also the threshold value $T_0$, in such a way that the total probability of error is minimized. This clearly requires a known sample image within the class under consideration in order to select M, N and $T_0$ properly through simulation.
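
As an illustration of the decision rule, the following Python sketch computes the balanced one-way ANOVA F ratio for M groups of N consecutive picture elements along one row and flags the positions where it exceeds a threshold. The function names, the centering of the neighborhood on the tested column, and the default threshold are assumptions made for illustration only; they are not taken from the report, where M, N and the threshold are selected by simulation.

```python
def one_way_anova_f(groups):
    """F ratio for M groups of N observations each (balanced one-way ANOVA)."""
    m = len(groups)
    n = len(groups[0])
    if m < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need M >= 2 groups of equal size N >= 2")
    group_means = [sum(g) / n for g in groups]
    grand_mean = sum(group_means) / m
    # Between-group (treatment) sum of squares.
    s_a = n * sum((gm - grand_mean) ** 2 for gm in group_means)
    # Within-group (error) sum of squares.
    s_e = sum((x - gm) ** 2 for g, gm in zip(groups, group_means) for x in g)
    if s_e == 0.0:
        return float("inf") if s_a > 0.0 else 0.0
    return (s_a / (m - 1)) / (s_e / (m * (n - 1)))


def detect_edges_in_row(row, m=2, n=8, threshold=9.0):
    """Return the column indices at which the F ratio exceeds the threshold.

    Assumption: the M*N-point neighborhood is slid one column at a time and
    the decision is attributed to the column at the middle of the window.
    """
    width = m * n
    edges = []
    for start in range(0, len(row) - width + 1):
        groups = [row[start + i * n: start + (i + 1) * n] for i in range(m)]
        if one_way_anova_f(groups) > threshold:
            edges.append(start + width // 2)
    return edges
```

A single step edge is typically flagged at several neighboring columns, because every window that straddles the step sees heterogeneous groups. This is the multiple-detection effect that limits how large the product of M and N can be made.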

The advantage of using quantile-ANOVA techniques over simple ANOVA techniques applied to the original data for the solution of edge detection problems is demonstrated in Figures 1 to 5. Figure 1 represents the uncorrupted reference pattern. To demonstrate the effectiveness of the quantile-ANOVA procedures, the pattern was corrupted with uniform or Gaussian noise. Figures 2 to 5 demonstrate the advantages of the new quantile-ANOVA procedures.

Texture edge detection with Gaussian N(0, 0.5) and N(0, 1) noise simulation is demonstrated in Figures 6 and 7. Here, again, quantile-ANOVA techniques perform very well, while the simple ANOVA approach fails to give an outline resembling the original pattern.

One of the disadvantages of one-way ANOVA is that it cannot detect perfectly horizontal edges. This, however, can be overcome by one-way ANOVA with a reference sample available, or simply by rotating the pattern by 90 degrees and applying one-way ANOVA to it. The processing time for quantile-ANOVA techniques was about one third of that for the direct ANOVA techniques, and in a multi-processing environment the ratio should improve further in favor of quantile techniques.

The detector parameters for one-way ANOVA, with and without quantile reduction, were picked empirically. In particular, M = 2 groups were chosen to limit the neighborhood size, and N was allowed to vary from 5 to 15. Best results were obtained for M = 2 and N = 8. The threshold $T_0$ was also chosen empirically, and it was very close to the value corresponding to $\alpha = 0.01$ if a normal distribution is assumed. The simulations demonstrate that quantile-ANOVA techniques perform much better than simple ANOVA techniques for the solution of gray level or texture edge detection. This is especially true if the image is severely corrupted.

Fig.  7.  Texture edge detection using one-way ANOVA after quantile reduction. Reference pattern corrupted as in Figure 6.

GRAY LEVEL DETECTION USING M-INTERVAL OCCUPANCY VECTOR IN
CONJUNCTION WITH MANN-WHITNEY STATISTIC

L.  Kurz  and  P.  Legakis 

In this report, two statistics of the two- and three-sample variety, which are modified versions of the Mann-Whitney statistic and belong to the general class of mixed statistics, are introduced. Unlike conventional Mann-Whitney statistics of the two- and three-sample type (Refs. 1 and 2), where the processing is performed on the raw data, the new mixed statistics use as observables the occupancy vector of the m-interval tests. The new statistics preserve the good qualities of both types of tests: the power of the standard Mann-Whitney tests and the robustness of the m-interval tests. The mixed tests differ from the original Mann-Whitney statistic in that they require considerably less processing time ($O(m^2)$ vs. $O(n^2)$ operations, $n \gg m$) and in that they can be used in both shift-of-the-mean and change-of-scale problems, while the standard Mann-Whitney tests on the original observations perform well only in shift-of-the-mean problems.

A.  A Two-Sample  Quantile  Mann-Whitney  Classifier 

The availability of a reference sample of n independent and identically distributed (i.i.d.) observations under the hypothesis with distribution F(x) is assumed. (The influence of dependence on the performance of the classifiers discussed in this chapter may be studied using the approach suggested by Woinsky and Kurz.) This reference sample is denoted by $Y_n = (y_1, y_2, \ldots, y_n)$. The received sample of n i.i.d. observations to be classified is denoted by $X_n = (x_1, x_2, \ldots, x_n)$. From the reference sample a vector $\mathbf{a} = (a_0, a_1, \ldots, a_m)$ corresponding to the m-interval quantile partitioning of F(x) is formed such that

\[ \mathrm{Prob}[a_{i-1} < y_j \le a_i] = F(a_i) - F(a_{i-1}) = p_i, \qquad i = 1, 2, \ldots, m; \quad j = 1, 2, \ldots, n \tag{1} \]

\[ a_0 \ge -\infty; \qquad a_m \le \infty; \qquad p_i = \frac{1}{m} \]

Let $n_i$ be a set of numbers defined as

\[ n_i = \mathrm{Card}\{ y_j : a_{i-1} < y_j \le a_i \}, \qquad i = 1, 2, \ldots, m; \quad j = 1, 2, \ldots, n \tag{2} \]

The classification problem is to decide whether $X_n$ and $Y_n$ come from the same distribution or not. With $a_i$ as defined by Eq. (1), a new sequence $\ell_i$ is generated such that

\[ \ell_i = \mathrm{Card}\{ x_j : a_{i-1} < x_j \le a_i \}, \qquad i = 1, 2, \ldots, m; \quad j = 1, 2, \ldots, n \tag{3} \]

The Mann-Whitney statistic, U, based on $\mathbf{a}$ and the occupancy vectors, is now generated by

\[ U = \frac{1}{m^2} \sum_{i=1}^{m} \sum_{j=1}^{m} u(n_i - \ell_j) \tag{4} \]

where

\[ u(n_i - \ell_j) = \begin{cases} 1 & \text{if } n_i \ge \ell_j \\ 0 & \text{if } n_i < \ell_j \end{cases} \tag{5} \]

The U statistic counts the number of times $n_i$ is at least as large as $\ell_j$, and on the basis of that count the classifier decides whether $X_n$ comes from the distribution F(x), under the hypothesis, or the distribution G(x), under the alternative. This formulation is distribution-free, not only because it involves quantile information, which may be obtained from the data, rather than exact knowledge of the appropriate distribution functions, but also because the U statistic is based on the relative ordering of the $n_i$ and $\ell_j$ and not on any known distributions.
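
The construction of Eqs. (1) to (5) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the cut points are taken from the order statistics of the reference sample (the report only says the quantiles may be obtained from the data), cells are half-open intervals closed on the right, and all function names are hypothetical.

```python
from bisect import bisect_left


def quantile_cut_points(reference, m):
    """Interior cut points a_1..a_{m-1} estimated from the reference sample.

    Assumption: empirical quantiles are read off the sorted reference sample so
    that each of the m cells holds about n/m reference observations.
    """
    ordered = sorted(reference)
    n = len(ordered)
    return [ordered[(i * n) // m - 1] for i in range(1, m)]


def occupancy(sample, cuts):
    """Occupancy vector: number of sample points in each cell (a_{i-1}, a_i]."""
    counts = [0] * (len(cuts) + 1)
    for x in sample:
        # bisect_left returns the first cell whose upper cut point is >= x.
        counts[bisect_left(cuts, x)] += 1
    return counts


def quantile_mann_whitney_u(n_counts, l_counts):
    """Normalized count of pairs (i, j) with n_i >= l_j, over m*m pairs."""
    m = len(n_counts)
    hits = sum(1 for ni in n_counts for lj in l_counts if ni >= lj)
    return hits / (m * m)


def two_sample_u(reference, sample, m):
    """Two-sample quantile statistic from raw reference and received samples."""
    cuts = quantile_cut_points(reference, m)
    return quantile_mann_whitney_u(occupancy(reference, cuts), occupancy(sample, cuts))
```

Once the two occupancy vectors are available, the statistic needs only m times m comparisons, independent of the sample size n.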

B.  Distribution of the Two-Sample Quantile Mann-Whitney Statistic Under the
Hypothesis (H) and the Alternative (K)

Mann and Whitney have shown that U is asymptotically normally distributed under both the hypothesis and the alternative, provided that the $n_i$ and $\ell_j$ are i.i.d.r.v. It is known (Ref. 6) that the $n_i$ are multinomially distributed,

\[ P(n_1, n_2, \ldots, n_m) = n! \prod_{i=1}^{m} \frac{p_i^{\,n_i}}{n_i!} \]

subject to the constraints

\[ \sum_{i=1}^{m} n_i = n \quad \text{and} \quad \sum_{i=1}^{m} p_i = 1 \]

Also

\[ E[n_i] = n p_i; \qquad \mathrm{Var}[n_i] = n p_i (1 - p_i); \qquad \mathrm{Cov}(n_i, n_j) = -n p_i p_j, \quad i \ne j \]

Extensive simulation studies under the same signal conditions, using Kendall's $\tau$ coefficient for independence, show that this dependence is very weak. In fact, simulation results show that the $n_i$ are independent at least at the $\alpha = 0.01$ level and, therefore, the moments of U in the form derived by Mann and Whitney can be used. The first and second moments of the U-statistic are given by

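Under H:

\[ E_H[U] = 0.5 \]

\[ \mathrm{Var}_H[U] = \sigma^2_{U/H} = \frac{2m + 1}{12\,m^2} \]

Under K:

\[ E_K[U] = 0.5 - \lambda \]

\[ \mathrm{Var}_K[U] = \sigma^2_{U/K} = \frac{2m + 1}{12\,m^2} + \frac{1}{m}\,(2\lambda - \epsilon_1 - \epsilon_2) \]

where

\[ \lambda = \frac{1}{2} - \int_{-\infty}^{\infty} G(x)\,dF(x) \]

\[ \epsilon_1 = \frac{1}{3} - \int_{-\infty}^{\infty} G^2(x)\,dF(x) \]

\[ \epsilon_2 = \frac{1}{3} - \int_{-\infty}^{\infty} (1 - F(x))^2\,dG(x) \]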

Since the first and second moments of a normal distribution completely define the distribution, and since the Mann-Whitney statistic is asymptotically normal, the asymptotic distributions under the hypothesis and the alternative are known.

Under uniform noise conditions and shift-in-the-mean alternatives, the first- and second-order moments of U can be easily specified by calculating $\lambda$, $\epsilon_1$ and $\epsilon_2$. Namely,

Under H:

\[ F(x) = G(x) = x\,u(x) - (x - 1)\,u(x - 1) \]

Under K:

\[ F(x) = x\,u(x) - (x - 1)\,u(x - 1) \]

and

\[ G(x) = F(x - c) = (x - c)\,u(x - c) - (x - c - 1)\,u(x - c - 1) \]

where u(x) is the unit step function and c is a constant shift parameter. Thus, for $0 \le c \le 1$,

\[ \lambda = \frac{1}{2} - \int_{c}^{1} (x - c)\,dx = \frac{1}{2}\left(1 - (1 - c)^2\right) \]

\[ \epsilon_1 = \frac{1}{3} - \int_{-\infty}^{\infty} G^2(x)\,dF(x) = \frac{1}{3}\left(1 - (1 - c)^3\right) \]

\[ \epsilon_2 = \frac{1}{3} - \int dG(x) + 2\int F(x)\,dG(x) - \int F(x)^2\,dG(x) = \left(\frac{c^2}{3} - c + 1\right) c \]

Direct substitution of the expressions for $\lambda$, $\epsilon_1$ and $\epsilon_2$ gives the expression for the moments of U in terms of the partition size m and the shift parameter c.
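
The closed forms for the uniform shift-in-the-mean case can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the report: it evaluates the three defining integrals with a midpoint rule for a uniform F on [0, 1] and G(x) = F(x - c), with 0 <= c <= 1, and compares them with the closed-form expressions. The function names and the step count are arbitrary choices.

```python
def clamp01(v):
    """Uniform(0, 1) distribution function evaluated at v."""
    return 0.0 if v < 0.0 else 1.0 if v > 1.0 else v


def moments_numeric(c, steps=20000):
    """Midpoint-rule values of lambda, epsilon_1, epsilon_2 for shift c."""
    h = 1.0 / steps
    int_g = int_g2 = int_1mf2 = 0.0
    for k in range(steps):
        x = (k + 0.5) * h                 # dF(x) = dx on [0, 1]
        g = clamp01(x - c)                # G(x) = F(x - c)
        int_g += g * h
        int_g2 += g * g * h
        z = c + (k + 0.5) * h             # dG(z) = dz on [c, 1 + c]
        int_1mf2 += (1.0 - clamp01(z)) ** 2 * h
    return 0.5 - int_g, 1.0 / 3.0 - int_g2, 1.0 / 3.0 - int_1mf2


def moments_closed_form(c):
    """Closed-form lambda, epsilon_1, epsilon_2 for 0 <= c <= 1."""
    lam = 0.5 * (1.0 - (1.0 - c) ** 2)
    eps1 = (1.0 - (1.0 - c) ** 3) / 3.0
    eps2 = (c * c / 3.0 - c + 1.0) * c
    return lam, eps1, eps2
```

For example, `moments_closed_form(0.5)[0]` gives the value of lambda for a shift of half the noise range, from which the mean of U under the alternative follows as 0.5 minus lambda.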

C.  The  Two-Sample  Quantile  Classification  Procedure 

The two-sample quantile classification procedure can be formulated as follows:

(1)  From the reference sample $Y_n$ compute the quantile vector $\mathbf{a} = (a_0, a_1, \ldots, a_m)$, which partitions the probability space under the hypothesis into m equiprobable segments.

(2)  From the same reference sample compute the $n_i$, which denote how many observations from the reference sample fall into each of the cells defined by $\mathbf{a}$.

(3)  From the sample to be classified compute the $\ell_i$, which denote how many observations from $X_n$ fall into each of the cells defined by $\mathbf{a}$.

(4)  On the basis of $n_i$ and $\ell_i$ compute the U-statistic as indicated by Equation (4).

(5)  Compare the value of U with the threshold $U_0$ which minimizes the average probability of error. (Other criteria for selection of $U_0$, such as Bayes, minimax, etc., are also applicable.)

(6)  Decide that the hypothesis is true if $U > U_0$ or that the alternative is true if $U < U_0$.

The average probability of incorrect classification, or average probability of error, $P_e$, is given by

\[ P_e = \alpha\,p_H + \beta\,p_K \tag{6} \]

where

$p_H$   a priori probability of H
$p_K$   a priori probability of K
$\alpha$   the type-I error
$\beta$   the type-II error

Equation (6) may be written explicitly as

\[ P_e = p_H \int_{-\infty}^{U_0} f_H(u)\,du + p_K \int_{U_0}^{\infty} f_K(u)\,du \tag{7} \]

Since the a priori probabilities $p_H$ and $p_K$ are not known, it is customary to assume that they are equally likely,

\[ p_H = p_K = \frac{1}{2} \]

It can be shown that the optimum threshold is

\[ U_0 = \frac{-B \pm \sqrt{B^2 - 4AC}}{2A} \]

where, with $\mu_{U/H} = E_H[U]$ and $\mu_{U/K} = E_K[U]$,

\[ A = \sigma^2_{U/H} - \sigma^2_{U/K} \]

\[ B = 2\left\{ \mu_{U/H}\,\sigma^2_{U/K} - \mu_{U/K}\,\sigma^2_{U/H} \right\} \]

\[ C = \sigma^2_{U/H}\,\mu^2_{U/K} - \sigma^2_{U/K}\,\mu^2_{U/H} - 2\,\sigma^2_{U/H}\,\sigma^2_{U/K} \ln\frac{\sigma_{U/H}}{\sigma_{U/K}} \]

The proper choice between the two values of $U_0$ is obvious.
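
The threshold computation can be illustrated numerically. The sketch below assumes that U is Gaussian under both H and K, that the a priori probabilities are equal, and that the mean under H is the larger one (so that H is chosen when U exceeds the threshold). Equating the two Gaussian densities then gives a quadratic in the threshold, whose coefficients appear in the code comments. The function names are hypothetical, and the root-selection rule (prefer the root lying between the two means) is one reasonable reading of the "obvious" choice.

```python
import math


def gaussian_pdf(u, mean, var):
    return math.exp(-0.5 * (u - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def optimum_threshold(mu_h, var_h, mu_k, var_k):
    """Crossing point of two equally weighted Gaussian densities."""
    # f_H(u) = f_K(u)  <=>  a*u^2 + b*u + c = 0 with the coefficients below.
    a = var_h - var_k
    b = 2.0 * (mu_h * var_k - mu_k * var_h)
    c = (var_h * mu_k ** 2 - var_k * mu_h ** 2
         - var_h * var_k * math.log(var_h / var_k))
    if abs(a) < 1e-15:
        if b == 0.0:
            raise ValueError("identical distributions: no threshold exists")
        return -c / b                      # equal variances: midpoint of the means
    disc = math.sqrt(b * b - 4.0 * a * c)
    roots = [(-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)]
    lo, hi = sorted((mu_k, mu_h))
    between = [r for r in roots if lo <= r <= hi]
    if between:
        return between[0]
    mid = 0.5 * (mu_h + mu_k)
    return min(roots, key=lambda r: abs(r - mid))


def average_error(u0, mu_h, var_h, mu_k, var_k, p_h=0.5):
    """Average error when H is decided for U > u0 (assumes mu_h > mu_k)."""
    def phi(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    alpha = phi((u0 - mu_h) / math.sqrt(var_h))        # K decided although H holds
    beta = 1.0 - phi((u0 - mu_k) / math.sqrt(var_k))   # H decided although K holds
    return p_h * alpha + (1.0 - p_h) * beta
```

With equal variances the quadratic degenerates to a linear equation and the threshold falls midway between the two means, which is a convenient sanity check.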

D.  Three-Sample  Quantile  Mann-Whitney  Statistic 

It is intuitively appealing that if, in addition to the reference sample taken from the distribution under the hypothesis, a sample taken from the distribution under the alternative were available at the classifier, a new test statistic could be formulated which would take advantage of the additional information, resulting in improved performance of the classifier. The three-sample quantile classification problem can be stated as follows.

Let $Y_n = (y_1, y_2, \ldots, y_n)$ and $Z_n = (z_1, z_2, \ldots, z_n)$ be two vectors consisting of n i.i.d.r.v. each, taken from a hypothesis distribution H and an alternative distribution K, respectively, i.e., $Y_n \in H$, $Z_n \in K$. Let $\mathbf{a} = (a_0, \ldots, a_m)$ and $\boldsymbol{\alpha} = (\alpha_0, \ldots, \alpha_m)$ be the corresponding m-interval equiprobable quantile partitionings under the hypothesis and the alternative distributions, respectively. Namely,

\[ \mathrm{Prob}[a_{i-1} < y_j \le a_i] = p_i \]

\[ \mathrm{Prob}[\alpha_{i-1} < z_j \le \alpha_i] = p_i \]

\[ i = 1, 2, \ldots, m; \qquad j = 1, 2, \ldots, n \]

\[ a_0,\ \alpha_0 \ge -\infty; \qquad a_m,\ \alpha_m \le +\infty \]

Let $X_n = (x_1, x_2, \ldots, x_n)$ be a vector of observations that need to be classified as coming either from H or K, and let $\boldsymbol{\ell} = (\ell_1, \ldots, \ell_m)$ and $\mathbf{w} = (w_1, \ldots, w_m)$ be two vectors such that

\[ \ell_i = \mathrm{Card}\{ x_j : a_{i-1} < x_j \le a_i \} \]

\[ w_i = \mathrm{Card}\{ x_j : \alpha_{i-1} < x_j \le \alpha_i \} \]

\[ i = 1, 2, \ldots, m; \qquad j = 1, 2, \ldots, n \]

The Mann-Whitney quantile statistic can now be formulated as follows:

\[ U = \frac{1}{m^2} \sum_{i=1}^{m} \sum_{j=1}^{m} u(\ell_i - w_j) \]

where

\[ u(\ell_i - w_j) = \begin{cases} 1 & \text{if } \ell_i \ge w_j \\ 0 & \text{if } \ell_i < w_j \end{cases} \]
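
A minimal Python sketch of the three-sample construction follows. As before, it is an illustration under assumptions: both partitions are estimated from order statistics of the respective reference samples, cells are closed on the right, and the function names are hypothetical.

```python
from bisect import bisect_left


def equiprobable_cuts(sample, m):
    """Interior cut points of an m-cell equal-count partition of the sample."""
    ordered = sorted(sample)
    n = len(ordered)
    return [ordered[(i * n) // m - 1] for i in range(1, m)]


def cell_counts(sample, cuts):
    """Occupancy vector of the sample over the cells defined by the cut points."""
    counts = [0] * (len(cuts) + 1)
    for x in sample:
        counts[bisect_left(cuts, x)] += 1
    return counts


def three_sample_u(y_ref, z_ref, x_obs, m):
    """Compare the occupancy of x_obs under the H partition with that under K."""
    l_counts = cell_counts(x_obs, equiprobable_cuts(y_ref, m))   # partition from H
    w_counts = cell_counts(x_obs, equiprobable_cuts(z_ref, m))   # partition from K
    hits = sum(1 for li in l_counts for wj in w_counts if li >= wj)
    return hits / (m * m)
```

In this toy setting, observations that resemble the H reference spread evenly over the H cells but pile up in one K cell, which pushes U upward; observations that resemble the K reference do the opposite.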


Note that the formulations of the two- and three-sample tests are similar, but with an important difference. While in the two-sample case the Mann-Whitney statistic measures a kind of relative distance of the observable from the hypothesis, in the three-sample case it measures the relative difference between the hypothesis and the alternative, which accounts for the improvement in performance of the three-sample over the two-sample case. Also, if the formulation of the two-sample Mann-Whitney statistic based on observations $X_n$ and $Y_n$ requires $O(n^2)$ operations, the formulation of the three-sample statistic requires $2\,O(n^2)$ operations. Such, however, is not the case in the formulation of the two- and three-sample quantile Mann-Whitney statistics, because they require the same number, $O(m^2)$, of operations once the quantile occupancy vectors $\boldsymbol{\ell} = (\ell_1, \ell_2, \ldots, \ell_m)$ and $\mathbf{w} = (w_1, w_2, \ldots, w_m)$ are available, and in usual applications $m \ll n$. It can be shown that the optimum threshold for the three-sample test is

1 2 

U = 4-  + o„ /„  in  — 

o 2 U/H  Pj^ 

and  the  corresponding  probability  of  error  is 


4{ 


V 2ir  a 


U/H 
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■where  6 is  a small  signal  approximation  to  the  parameter 

- F^(?))dF  (?) 

-or  y ^ 


E.  Simulation  Results 

In order to evaluate the two- and three-sample quantile Mann-Whitney statistics subjectively, they were applied to a pattern which is based on the binary pattern of Figure 1. The symbol 'X' indicates binary ones, and a second symbol indicates binary zeros. A pictorial view of what noise of a certain type does to a binary pattern is given in Figures 2 through 4. For example, Fig. 3 results from Fig. 1 when the zero and one symbols are replaced by observations from N(0, 1) and N(1, 1), respectively, and a threshold of 0.5 is applied. To indicate how well the two-sample statistic performs under the same conditions as those of Fig. 3, reference is made to Figure 5. The three-sample statistic performs considerably better (not shown because it looks almost like the reference pattern). Figure 6 illustrates the performance of the three-sample statistic under severe [N(0, 1) and N(0.5, 1)] noise conditions. Although the detected pattern is not as good as Fig. 1, it is considerably better than Figure 4. Figure 7 illustrates the performance of the three-sample statistic under change-of-scale conditions. It is to be noted that the original Mann-Whitney statistic takes considerably longer to process for approximately the same performance under severe noise conditions and that it cannot be applied to change-of-scale problems.

Fig.  7.  Gray level detection using the three-sample Mann-Whitney statistic based on quantile occupancy vectors. Reference pattern corrupted with Gaussian N(0, 1) and N(0.5, 1) noise.

DETERIORATION  IN  PERFORMANCE  OF  A DUOBINARY  PAM  SYSTEM  BY
INTRODUCTION  OF  NONLINEARITIES

L.  Kurz  and  M.  Wernicki 

In this report, the effect of memoryless nonlinearities on intersymbol interference in a duobinary PAM system is considered. As was shown in Refs. 1 and 2, nonlinearities are introduced in a PAM system to improve overall performance if the sources of degradation include not only intersymbol interference and Gaussian noise but also impulsive noise. It was demonstrated that the influence of nonlinearities, whether natural to each system or introduced to suppress impulsive noise, creates problems in controlling intersymbol interference.

The analysis in Refs. 1 and 2 concentrated on the third-order polynomial nonlinearity as the best representative of the general class of nonlinearities which approximate natural nonlinearities or nonlinearities that are near-optimum for impulsive noise. This model is used here, too. The analysis in these references, which concentrated on standard PAM systems, is extended to PAM systems with duobinary signalling (Refs. 3 and 4). Using the prolate spheroidal wave functions (PSWF) in a series expansion, bounds are found on the intersymbol interference introduced by the nonlinearity. Related problems of elimination of intersymbol interference caused by the nonlinearity are also considered. The importance of the duobinary system, which has the rate capability of a quaternary system but requires only slightly more complicated circuitry than a standard binary PAM system, makes its study particularly useful.

A.  Bound  on  the  Intersymbol  Interference  Caused  by  the  Third  Power  Nonlinearity  -- 
PSWF  Series  Approach 

The duobinary signal is of the form

\[ s(t) = \frac{4}{\pi}\,\frac{\cos(\pi t / T)}{1 - 4t^2 / T^2} \tag{1} \]

with Fourier transform

\[ S(\omega) = \begin{cases} 2T \cos(\omega T / 2), & |\omega| \le \pi / T \\ 0, & \text{otherwise} \end{cases} \tag{2} \]

where the notation of Ref. 4 has been used. The transfer functions of the corresponding transmitting and receiving filters are

\[ G_T(\omega) = G_R(\omega) = \begin{cases} \left( 2T \cos(\omega T / 2) \right)^{1/2}, & |\omega| \le \pi / T \\ 0, & \text{otherwise} \end{cases} \tag{3} \]

At the output of the nonlinearity $f(x) = a_1 x + a_3 x^3$ with proper truncation (see Ref. 1), the only term which will contribute to intersymbol interference will be the cubic term

\[ y_i(t) = \left[ \frac{4}{\pi}\,\frac{\cos(\pi t / T)}{1 - 4t^2 / T^2} \right]^3 \]
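
The behavior of the duobinary pulse and of its cube at the sampling instants can be checked with a short Python sketch. It is an illustration only: the function names are hypothetical, the symbol period is normalized to T = 1 by default, and the removable singularity at t = ±T/2 is replaced by its limiting value of 1.

```python
import math


def duobinary_pulse(t, T=1.0):
    """Duobinary pulse (4/pi) cos(pi t/T) / (1 - 4 t^2/T^2)."""
    x = 2.0 * t / T
    denom = 1.0 - x * x
    if abs(denom) < 1e-9:
        return 1.0                          # limiting value at t = +/- T/2
    return (4.0 / math.pi) * math.cos(math.pi * t / T) / denom


def cubic_term(t, T=1.0):
    """Cube of the duobinary pulse, as produced by a third-power nonlinearity."""
    return duobinary_pulse(t, T) ** 3


def samples_at_half_odd_multiples(kmax=3, T=1.0):
    """Pulse and cubic-term values at t = (2k + 1) T / 2 for |k| up to kmax."""
    out = []
    for k in range(-kmax, kmax + 1):
        t = (2 * k + 1) * T / 2.0
        out.append((t, duobinary_pulse(t, T), cubic_term(t, T)))
    return out
```

At the instants t = (2k + 1)T/2 the pulse equals 1 for the two central samples and vanishes elsewhere, and a memoryless cube preserves those zeros. The cubed pulse, however, occupies three times the bandwidth of the original, so any subsequent band-limiting filter reshapes it, which is one way to see why intersymbol interference can reappear after filtering.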


Consider  the  expansion 


4(t)  = Z 


Itl 


= 0 otherwise 

where  {02^(c,t)]  are  the  prolate  spheroidal  wave  functions  generated  by  the  integral 
equation 

tt/T  . , 

Xn(c)  0j^(c,v)  = f 0n(c,x)dx  n=0,l,2,---  (6 

-tt/T 

These  functions  satisfy 


f 0 (c , v)  0 (c , v)  dv  = 6 
J.n'  m mn 


,tr/T 

-ir/T 

where  6 is  the  Kronecker  delta.  In  addition, 
mn 

TT  j T 

2(j)" /TIFT  0^(c,<*))  = /''  e-"“^  0 (c,t)dt  (9) 

n n / rr'  n 

-tt/T 

In  view  of  the  properties  of  PSWF,  it  can  be  shown  that  the  coefficients  in  Eq.  (5) 


f-n" 

-~mS 


2[X,„(c)]^'^  -tt/T 


where 


Y.(a,)=  [y.(t)]  ^ 4(w)  = Z (IklT  "2k 
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and  s are  the  moments  of  y.(t).  Using  the  mean-square  error  criterion, 
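
\[ \epsilon^2 = \frac{\displaystyle\int_{-\pi/T}^{\pi/T} [y_i(t) - \hat{y}_i(t)]^2\,dt}{\displaystyle\int_{-\pi/T}^{\pi/T} y_i^2(t)\,dt} \tag{12} \]

one needs to find a bound on the numerator of Equation (12). This numerator may be written in the form

\[ \int_{-\pi/T}^{\pi/T} [y_i(t) - \hat{y}_i(t)]^2\,dt = \sum_{n=0}^{N} (a_{2n} - \hat{a}_{2n})^2 + \sum_{n=N+1}^{\infty} a_{2n}^2 \]

where the orthogonality property of the PSWF on $[-\pi/T, \pi/T]$ has been used. But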

TT  / T ^ 

l-2„  ■ >2„l  = 7T7T  { / , 

and,  applying  Schwarz'  inequality  to  Eq  . (14),  yields 

<»2„- »2„ii 

ir  *r 

= f [Y.{i;j)  - Y.(oj)]^dv 

\Ijc)  -tt/T  ' 


Since 


(2k+2) 

[Y.(ui)  - Y^(j)]  < ^2k+  2Tl'  ®2k+2 
where  use  has  been  made  of  Eq.  (11) 


1 


4k  + 6 


2n  2n  - [(2k  + 3) l] (4k  + 5) 

and.  introducing  a mean-square  error  for  Y.(c.)  of  the  same  form  as  Eq.  (12),  re- 

o 

suits  in 

tt/T  ->  7 7 

f 'y.(t)  - a e(c,t)l^dt<  12e^  j y • (t)  dt 

■’-./t'"'  „7o  " " ' "»  -./T 


which  leads  to 
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y a^  < 12e^ 

-J  2n  — JO  ■ 

c ° 

-^/T 

-ir/T 

■ f-'' 

(19) 

n=-+l 

yr 

2 

The  estimation  error,  e , is  then  bounded  by 

1 


e2< 


4k+  6 


[(2k+3)l]"(4k+5)  7"'^''y2(t)dt 
-tt/T  ^ 


(20) 


A pessimistic estimate of this error is obtained by using an upper bound for $\epsilon_o^2$ and a lower bound for $\int_{-\pi/T}^{\pi/T} y_i^2(t)\,dt$ in Equation (20). Typically $\epsilon_o^2 < 0.01$, and using the lower bound on $\int_{-\pi/T}^{\pi/T} y_i^2(t)\,dt$ results, for c = 1 and k = 1, in $\epsilon^2 < 3.46 \times 10^{-2}$.

B.  Some  Comments  on  Elimination  and  Boundedness  of  Intersymbol  Interference 

Following the same reasoning as in Ref. 2, it can be shown that it is not always possible to eliminate intersymbol interference by linear filtering when a memoryless nonlinearity is present. As a matter of fact, the theorems given in the reference are still valid. It remains to demonstrate that the boundedness of the intersymbol interference is guaranteed.

Consider the cubic term of the output of the nonlinearity followed by a linear filter with impulse response h(t). The output of the filter is obtained by convolving $y_i(t)$ and h(t). At any sampling instant t = nT, the output of the linear filter is


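\[ z(0) = \int_{-NT}^{NT} y_i(t)\,h(-t)\,dt \tag{21} \]

Applying Schwarz' inequality,

\[ [z(0)]^2 \le \int_{-NT}^{NT} y_i^2(t)\,dt \int_{-\infty}^{\infty} h^2(-t)\,dt \tag{22} \]

There exists an M such that

\[ |x^M| \ge |y_i(t)| \tag{23} \]

\[ \int_{-NT}^{NT} y_i^2(t)\,dt \le \int_{-NT}^{NT} \left[ \frac{4}{\pi}\,\frac{\cos(\pi t / T)}{1 - 4t^2 / T^2} \right]^{2M} dt \tag{24} \]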
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from  which  it  follows  that 


I 


NT 


-NT 


4 

2M 

dt  < 

NT 

f 

4 

1 

TT 

O 

-NT 

TT 

2M 


dt 


(25) 


From Eqs. (22) and (25), we conclude that intersymbol interference is bounded provided that h(t) is square-integrable. If the nonlinearity is more general than cubic, the boundedness of the intersymbol interference is guaranteed if the new $y_i(t)$ satisfies Equation (23).

BIVARIATE  DEPENDENT  SEQUENTIAL  PARTITION  DETECTORS
R.F.  Dwyer  and  L.  Kurz

The purpose of this report is to extend the theory of partition detectors to include bivariate dependent sequential partition detection. The extension is of particular use in picture, radar and underwater sounding data processing. A general formulation of the problem is followed by application to the shift-of-the-mean and Lehmann alternatives models.

A.  Structure  of  Bivariate  Dependent  Sequential  Partition  Detectors  (BSPD) 

Let $z_i = (x_i, y_i)$, $i = 1, 2, \ldots, n$, be an i.i.d. sample from a stationary bivariate random process with c.d.f. F(x, y). Define $\mathbf{a} = (a_0 = -\infty, a_1, \ldots, a_{m-1}, a_m = +\infty)$ as an ordered vector of quantiles, estimated with independent samples for the c.d.f. F(x). Assume F(x, y) = F(x) F(y) when independent, and that $\mathbf{a}$ partitions the space into $m^2$ mutually exclusive cells in $R^2$.

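Define $n_{kj}$ as the number of observations falling into the two-dimensional cell (k, j),

\[ n_{kj} = \mathrm{Card}\{ z_i : a_{k-1} < x_i \le a_k,\ a_{j-1} < y_i \le a_j;\ i = 1, 2, \ldots, n;\ k = 1, 2, \ldots, m;\ j = 1, 2, \ldots, m \} \]

and

\[ P_{Lkj} = P_L(z_i \in I_{kj}) = P_L(a_{k-1} < x_i \le a_k,\ a_{j-1} < y_i \le a_j) \]

as the probability of the joint event, where

\[ \sum_{k}^{m} \sum_{j}^{m} P_{0kj} = \sum_{k}^{m} \sum_{j}^{m} P_{1kj} = 1 \]

From a direct generalization of the one-dimensional m-interval case, the variables $n_{kj}$ are multinomially distributed (see Ref. 2, p. 168), with

\[ P_L(\mathbf{n}) = P(n_{11}, n_{12}, \ldots, n_{1m}, \ldots, n_{mm}) = \frac{n!}{n_{11}!\,n_{12}! \cdots n_{kj}! \cdots n_{mm}!} \prod_{k}^{m} \prod_{j}^{m} P_{Lkj}^{\,n_{kj}} \]

Define

\[ \Lambda_n = \ln \frac{P_1(\mathbf{n})}{P_0(\mathbf{n})} = \sum_{k}^{m} \sum_{j}^{m} n_{kj} \ln \frac{P_1(a_{k-1} < x_i \le a_k,\ a_{j-1} < y_i \le a_j)}{P_0(a_{k-1} < x_i \le a_k,\ a_{j-1} < y_i \le a_j)} \tag{1} \]

as the log-likelihood ratio for $(z_1, z_2, \ldots, z_n)$ under $H_0$ and $H_1$.

Let

\[ u_{ikj} = \begin{cases} 0 & \text{if } z_i \notin I_{kj} \\ 1 & \text{if } z_i \in I_{kj} \end{cases} \]

Then,

\[ \Lambda_n = \sum_{i}^{n} \sum_{k}^{m} \sum_{j}^{m} u_{ikj}\,b_{kj} \tag{2} \]

where

\[ b_{kj} = \ln \frac{P_{1kj}}{P_{0kj}} \]

Equation (2) represents the bivariate sequential partition detector (BSPD). For the data sequence $z_1, z_2, \ldots, z_n$, the detector continues sampling if $b < \Lambda_n < a$ and terminates with acceptance of $H_0$ or $H_1$ if $\Lambda_n \le b$ or $\Lambda_n \ge a$, respectively.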
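
The sequential rule can be sketched in Python as follows. This is a minimal illustration under assumptions: the same interior cut points are used on both axes, the score table is supplied by the caller (for instance as log-ratios of cell probabilities), and the function names, the decision labels and the handling of an exhausted data record are hypothetical.

```python
from bisect import bisect_left


def cell_index(value, cuts):
    """Index of the interval (a_{k-1}, a_k] containing value, with a_0 = -inf."""
    return bisect_left(cuts, value)


def bspd(samples, cuts, scores, lower, upper):
    """Run the sequential test on (x, y) pairs.

    scores[k][j] is the score added when a pair falls into cell (k, j).
    Returns (decision, number_of_samples_used).
    """
    total = 0.0
    for count, (x, y) in enumerate(samples, start=1):
        total += scores[cell_index(x, cuts)][cell_index(y, cuts)]
        if total >= upper:
            return "H1", count              # upper threshold crossed
        if total <= lower:
            return "H0", count              # lower threshold crossed
    return "undecided", len(samples)        # record exhausted before a decision
```

Each new pair costs two binary searches and one table lookup, so the per-sample work depends on m only logarithmically, while the number of pairs consumed is itself random.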

Define the random variable

\[ T_i = \sum_{k}^{m} \sum_{j}^{m} u_{ikj}\,b_{kj} \tag{3} \]

with moment generating function

\[ \phi(t) = E[e^{t T_i}] \]

Following the usual procedure for sequential tests as in Eq. (3), a value of $t = t_0 \ne 0$ can be found so that $\phi(t_0) = 1$, and


t 

o 

n-^oo 


2 


E(TJ 


The  average  sample  number  for  BSPD  is 

t a t b 

L , Me  ° - 1)  + a(l  - e 1 i 0 = 


A.SNB  = 


E(T.)  = 0 
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B.  BSPD  Under  a Bivariate  Lehmann  Alternative 

Let the dependence exist under the hypothesis $H_0$ and alternative $H_1$. Assume the signal (represented by $\delta$ as the SNR) shifts the two-dimensional space according to the relationship

\[ H_0: F(x, y) \]

\[ H_K: G(x, y) = [F(x, y)]^{1 + \delta} \approx F(x, y) + \delta\,F(x, y) \ln F(x, y) \le F(x, y) \]

The quantiles are assumed to be estimated under the univariate distribution F(x). However, now the scores $b_{kj}$ will reflect the bivariate dependence.

Define 

^okj  = + ^<^k-i'^j-i>  - 

as  the  probability  of  a sample  falling  into  cell  (k,  j).  Then,  the  probability  of  falling 
into  cell  (k,j)  under  the  selected  alternative  and  the  alternative  (Hj^)  are  given  by, 

respectively, 

Ikj  okj  1 kj 


P,  . » P . . + 6 A.  . 
kj  okj  kj 

where  6 and  6^  have  the  same  meaning  as  in  the  one -dimensional  sequential  partition 
detector  discussed  elsewhere  in  this  report. 

^kj  = + ^<^k-i’^j-i)  ^K-I'^j-1> 

- F(a^_^,a.)  in  F(aj^_j,a.)  - F(a^,a._j)  in  F(a^,a._^) 

The mean and variance of $T_i$ can be shown, straightforwardly, to be

$$E(T_i) = (\delta\delta_1 - \delta_1^2/2)\sum_{k=1}^{m}\sum_{j=1}^{m} A_{kj}^2/P_{0kj}$$

$$\sigma_{T_i}^2 = \delta_1^2 \sum_{k=1}^{m}\sum_{j=1}^{m} A_{kj}^2/P_{0kj}$$

Then, the solution of $\phi(t) = 1$, $t \neq 0$, becomes

$$t_0 = -\frac{2E(T_i)}{\sigma_{T_i}^2} = 1 - 2\delta/\delta_1 = t(\delta)$$

This is an interesting result; $t(\delta)$ is free from any bivariate dependence. The thresholds remain constant along with $\alpha$ and $\beta$, which is a distinct advantage.

Instead of the thresholds, the scores, $b_{kj}$, will have to be adjusted to compensate
for  the  amount  of  dependence  introduced  in  any  particular  environment.  Even  though 
the  BSPD  is  restricted  to  a sampling  rate  of  2/a\  where  is  the  effective  bandwidth, 
the form and amount of dependence need not be known per se. This means the BSPD maintains constant $\alpha$, $\beta$ under unknown dependent noise conditions as long as the bivariate composition is maintained.

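C. BSPD Under a Bivariate Shift Alternative

Assume the same conditions hold under a shift alternative as in B. Then,

$$H_0: F(x,y)$$

$$H_1: G(x,y) = F(x - \Delta, y - \Delta) \approx F(x,y) - \Delta\left(\frac{\partial}{\partial x} + \frac{\partial}{\partial y}\right)F(x,y)$$

and, for the selected alternative,

$$G(x,y) \approx F(x,y) - \Delta_1\left(\frac{\partial}{\partial x} + \frac{\partial}{\partial y}\right)F(x,y)$$

where the quantiles again will be estimated under the univariate distribution F(x). Proceeding as before,

$$P_{0kj} = F(a_k,a_j) + F(a_{k-1},a_{j-1}) - F(a_{k-1},a_j) - F(a_k,a_{j-1})$$

$$P_{1kj} \approx P_{0kj} + \Delta_1 B_{kj}$$

$$P_{kj} \approx P_{0kj} + \Delta B_{kj}$$

where

$$B_{kj} = -[F_x(a_k,a_j) + F_y(a_k,a_j)] - [F_x(a_{k-1},a_{j-1}) + F_y(a_{k-1},a_{j-1})] + [F_x(a_{k-1},a_j) + F_y(a_{k-1},a_j)] + [F_x(a_k,a_{j-1}) + F_y(a_k,a_{j-1})]$$

with $F_x(a_k,a_j) = \partial F/\partial x$ and $F_y(a_k,a_j) = \partial F/\partial y$ evaluated at $x = a_k$, $y = a_j$.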


The mean and variance of $T_i$ can be written directly:

$$E(T_i) = (\Delta\Delta_1 - \Delta_1^2/2)\sum_{k=1}^{m}\sum_{j=1}^{m} B_{kj}^2/P_{0kj}$$

$$\sigma_{T_i}^2 = \Delta_1^2 \sum_{k=1}^{m}\sum_{j=1}^{m} B_{kj}^2/P_{0kj}$$

The solution for $t = t_0 \neq 0$ becomes

$$t = t_0 \underset{n\to\infty}{=} -\frac{2E(T_i)}{\sigma_{T_i}^2} = 1 - 2\Delta/\Delta_1 = t(\Delta)$$

where $\Delta$ and $\Delta_1$ have the same meaning as in the one-dimensional sequential test discussed elsewhere in this report.

RELATIVE PERFORMANCE OF SEQUENTIAL AND FIXED SAMPLE SIZE PARTITION DETECTORS

R.F. Dwyer and L. Kurz

The main purpose of this report is to extend the theory of nonparametric partition detectors to include sequential operation. To that end, the observation space is partitioned into m intervals based on the knowledge of only a finite number of quantiles of the noise distribution, which can be easily estimated using the techniques presented in References 1 and 2. The ease of implementation and robustness of the m-interval detectors make the sequential partition detectors (SPD) particularly attractive.

Specifically, a formulation for the SPD is given and then test statistics are constructed for the Lehmann and shift-in-the-mean alternatives. Efficiency expressions are derived and maximized by finding the optimal quantiles of the noise distribution. Asymptotic properties for fixed sample size m-interval partition detectors are derived, and measures of efficiency are introduced which compare the fixed sample size detector with the SPD. An important theorem is proven which shows that the optimum quantiles which maximize the efficiency for fixed sample size and sequential partition detectors are the same. The latter fact permits the use of Fisher's information as a bound on performance and assures the existence of a locally most powerful (LMP) SPD. It is shown that an advantage of the SPD over its fixed sample size detector counterpart is that it can be up to four times more efficient based on the partition asymptotic relative efficiency (PARE).

A. Formulation of a Sequential Partition Detector (SPD)

Following Ching and Kurz, the SPD will be formulated by partitioning the observation space into a finite number of regions based on the quantiles of the noise distribution.

Before discussing the SPD in detail, a few remarks are needed. It is well known that a sequential test based on the likelihood ratio will have the optimum property at $\theta_0$ (signal strength under the hypothesis, $H_0$) and $\theta_1$ (signal strength under the alternative, $H_1$). By estimating the quantiles of a noise distribution under $H_0$ and specifying an alternative sensitive to the signal strength $\theta_1$, a likelihood ratio statistic is formed which is invariant to changing distributions. Since only the quantiles of the distribution need to be known, the form of the sequential test remains the same. Therefore, the objective of this section is to formulate a procedure which will minimize the average sample number (ASN) for the SPD as a function of the number of quantiles estimated. Let $E[n/\theta]$ and $E_0[n/\theta]$ represent the ASN for the SPD and the optimum SPD, respectively. We seek $\min[E(n/\theta)] \ge \min[E_0(n/\theta)]$, where n is a random variable representing the number of samples, and $\lim_{m\to\infty}[\min E(n/\theta)] = \min[E_0(n/\theta)]$, where (m - 1) is the number of quantiles estimated.

Assume $x_1, x_2, \ldots, x_n$ is a sample of i.i.d. r.v.'s with an a.c. d.f. F(x) and let $\mathbf{a} = (a_0 = -\infty, a_1, \ldots, a_{m-1}, a_m = \infty)$ be the quantiles of F(x). Let $n_k$ be the number of observations falling into the interval $(a_{k-1}, a_k]$. In addition, introduce

$$P_{vk} = F_v(a_k) - F_v(a_{k-1}), \quad v = 0, 1$$

representing the probability of a sample, $x_i$, falling into the k-th interval under the hypothesis or alternative subject to

$$\sum_{k=1}^{m} P_{vk} = 1, \qquad \sum_{k=1}^{m} n_k = n$$


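The likelihood ratio statistic upon which the SPD is based becomes

$$\frac{P_1(\mathbf{n})}{P_0(\mathbf{n})} = \frac{\prod_{k=1}^{m} P_{1k}^{\,n_k}}{\prod_{k=1}^{m} P_{0k}^{\,n_k}} \qquad (1)$$

It is easy to show that an equivalent test to the one given by Eq. (1) is of the form

$$T_n = \sum_{i=1}^{n}\sum_{k=1}^{m} n_{ik}\, b_k = \sum_{i=1}^{n} T_i \qquad (2)$$

where

$$n_k = \sum_{i=1}^{n} n_{ik}$$

and

$$b_k = \ln(P_{1k}/P_{0k})$$

are the scores.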

Equation (2) represents the classical cumulative sum form of the SPD with stopping boundaries given by

$$b = \ln B < T_n < \ln A = a \qquad (3)$$

Samples continue to be accumulated as long as $T_n$ falls between $b$ and $a$.
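
The cumulative-sum test of Eqs. (2) and (3) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `spd`, its arguments, and the use of `bisect` to locate the interval $(a_{k-1}, a_k]$ are hypothetical choices made for illustration, and the cell probabilities `p0` and `p1` are assumed to be supplied by the caller.

```python
import bisect
import math

def spd(samples, quantiles, p0, p1, A, B):
    """Sequential partition detector sketch (names are illustrative).

    quantiles: interior quantiles a_1 < ... < a_{m-1}
    p0, p1:    cell probabilities under H0 and the selected alternative
    A, B:      Wald thresholds, so that a = ln A and b = ln B
    """
    scores = [math.log(q1 / q0) for q0, q1 in zip(p0, p1)]  # b_k = ln(P1k/P0k)
    a, b = math.log(A), math.log(B)
    t = 0.0
    for n, x in enumerate(samples, start=1):
        # number of quantiles strictly below x = index of cell (a_{k-1}, a_k]
        k = bisect.bisect_left(quantiles, x)
        t += scores[k]
        if t >= a:
            return "H1", n
        if t <= b:
            return "H0", n
    return "undecided", len(samples)
```

With a single quantile at zero, `p0 = [0.5, 0.5]` and `p1 = [0.25, 0.75]`, a run of positive samples drives the statistic up by ln 1.5 per sample, while a run of negative samples drives it down by ln 2 per sample.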

B. Constructing SPD's Based on Selected Alternatives

The problem now is to find a suitable alternative relating $P_{1k}$ and $P_{0k}$ to a meaningful parameter, $\delta$, which separates the hypothesis from the alternative. Two such classes of alternatives will be discussed here: the Lehmann alternative (a discussion of this class is given elsewhere in this report) and the shift-in-the-mean alternative.

1. Lehmann Alternative

Lehmann proposed a one-parameter family of nonparametric classes of alternatives

$$G(x) = F(x)^{1+\delta}$$

where $G(x) < F(x)$ for $\delta > 0$. For $\delta$ small

$$H_0: F(x) \quad \text{and} \quad H_1: G(x) \approx F(x) + \delta F(x)\ln F(x)$$

The parameter $\delta$ represents the unknown true separation between the hypothesis and alternative under the Lehmann class. The objective of the SPD is either to accept or reject the hypothesis or accumulate another sample under the unknown separation $\delta$ given a preselected (usually based on the minimax criterion) separation $\delta_1$. In this case

$$P_k = P_{0k} + \delta A_k$$

$$b_k = \ln(1 + \delta_1 A_k/P_{0k})$$

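Following the procedure of Ref. 3

$$E[T_i] = \sum_{k=1}^{m} \ln(1 + \delta_1 A_k/P_{0k})\,(P_{0k} + \delta A_k)$$

$$\sigma_{T_i}^2 = \sum_{k=1}^{m} \ln^2(1 + \delta_1 A_k/P_{0k})\,(P_{0k} + \delta A_k) - E^2[T_i]$$

which reduce for $\delta_1$ and $\delta$ small to

$$E[T_i] = (\delta\delta_1 - \delta_1^2/2)\sum_{k=1}^{m} A_k^2/P_{0k}$$

$$\sigma_{T_i}^2 = \delta_1^2 \sum_{k=1}^{m} A_k^2/P_{0k}$$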
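
The small-signal reduction above can be checked numerically. The sketch below is illustrative only: the function name `lehmann_moments` is hypothetical, and it assumes the one-dimensional form $A_k = F(a_k)\ln F(a_k) - F(a_{k-1})\ln F(a_{k-1})$, taken by analogy with the bivariate $A_{kj}$ defined elsewhere in this report. It compares the exact mean and variance of $T_i$ with the reduced expressions, and the ratio $-2E[T_i]/\sigma_{T_i}^2$ with $1 - 2\delta/\delta_1$.

```python
import math

def lehmann_moments(u, delta, delta1):
    """u = [F(a_0)=0, F(a_1), ..., F(a_m)=1]; returns exact and reduced moments."""
    def xlogx(v):
        return v * math.log(v) if v > 0.0 else 0.0  # limit of v ln v at v = 0

    m = len(u) - 1
    p0 = [u[k] - u[k - 1] for k in range(1, m + 1)]
    a = [xlogx(u[k]) - xlogx(u[k - 1]) for k in range(1, m + 1)]  # assumed A_k
    b = [math.log(1.0 + delta1 * ak / pk) for ak, pk in zip(a, p0)]  # scores
    p = [pk + delta * ak for ak, pk in zip(a, p0)]  # cell probabilities under delta

    mean = sum(bk * pk for bk, pk in zip(b, p))
    var = sum(bk * bk * pk for bk, pk in zip(b, p)) - mean ** 2

    s = sum(ak * ak / pk for ak, pk in zip(a, p0))
    return mean, var, (delta * delta1 - delta1 ** 2 / 2.0) * s, delta1 ** 2 * s

# Four equiprobable cells, true separation equal to the design separation.
mean, var, mean_small, var_small = lehmann_moments(
    [0.0, 0.25, 0.5, 0.75, 1.0], 0.02, 0.02)
t0 = -2.0 * mean / var  # compare with 1 - 2*delta/delta1 = -1
```

For these small separations the exact and reduced values agree to within a few percent, and `t0` is close to -1.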

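Following Wald and Bahadur

$$t_0(\delta) = 1 - 2\delta/\delta_1$$

and the performance measures - probability of terminating the test and ASN - are

$$P(\delta) = \frac{1 - e^{t_0(\delta) b}}{e^{t_0(\delta) a} - e^{t_0(\delta) b}}$$

$$E[n/\delta] = \begin{cases} \dfrac{b(e^{a t_0(\delta)} - 1) + a(1 - e^{b t_0(\delta)})}{[e^{a t_0(\delta)} - e^{b t_0(\delta)}]\,(\delta\delta_1 - \delta_1^2/2)\sum_{k=1}^{m} A_k^2/P_{0k}}, & \delta \neq \delta_1/2 \\[2ex] -\dfrac{ab}{\delta_1^2 \sum_{k=1}^{m} A_k^2/P_{0k}}, & \delta = \delta_1/2 = \delta_c \end{cases}$$

It is interesting to note that in Eqs. (10) and (11) $\delta$ and $\delta_1$ do not depend on the quantiles. This means that $E[n/\delta]$ can be minimized by maximizing

$$\sum_{k=1}^{m} A_k^2/P_{0k}$$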


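2. Shift Alternative

In this case the alternative is related to the hypothesis by

$$G(x) = F(x - \Delta)$$

For $\Delta$ and $\Delta_1$ small,

$$E[T_i] = (\Delta\Delta_1 - \Delta_1^2/2)\sum_{k=1}^{m} \frac{[f(a_k) - f(a_{k-1})]^2}{F(a_k) - F(a_{k-1})} \qquad (12)$$

and

$$\sigma_{T_i}^2 = \Delta_1^2 \sum_{k=1}^{m} \frac{[f(a_k) - f(a_{k-1})]^2}{F(a_k) - F(a_{k-1})} \qquad (13)$$

which are similar to Eqs. (8) and (9) except that the slope of F(x), that is, the p.d.f. f(x), is involved. If one identifies

$$A_k^2/P_{0k} \rightarrow \frac{[f(a_k) - f(a_{k-1})]^2}{F(a_k) - F(a_{k-1})} \qquad (14)$$

the optimization procedure for Lehmann alternatives may be followed.

C. Measures of Efficiency for Sequential and Fixed Sample Size Partition Detectors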

One of the important results of comparison between SPD and fixed sample size partition detectors may be summarized in the form of a theorem.

Theorem 1: The optimum quantiles which maximize the asymptotic relative efficiency (ARE) for the SPD and the fixed sample size partition detector under Lehmann and shift alternatives are the same.

Proof: See Reference 9.

This is an important result which assures optimization (for some m) of both the SPD and fixed sample size partition detectors for the same test statistic under identical alternatives. Another interesting point is that the efficacy of the SPD is bounded by Fisher's information, the same as for the fixed sample size partition detectors. Thus, a locally most powerful SPD exists based on quantiles or, as $m \to \infty$, the SPD approaches the best among all locally optimum sequential tests for the same noise environment. Since the quantiles are estimated under ambient noise conditions, the SPD possesses the desirable property of being able to adapt to changing noise conditions and maintain its optimum quantities. Other sequential tests designed under known distributions may perform poorly in a changing noise environment.

The disadvantage of the Pitman-Noether ARE approach as a performance measure is that it does not show the dependence on $\alpha$ and $\beta$ over a range of signal strength, $\theta$. Bahadur and Hodges-Lehmann efficiencies alleviate this difficulty somewhat. However, a better approach is needed when comparing sequential and nonsequential detectors.

Consider a detector of the form

$$T_N = \sum_{i=1}^{N}\sum_{k=1}^{m} n_{ik}\, b_k \qquad (15)$$

as in Equation (2). If $\alpha$ and $\beta$ are small then

$$N = \frac{[\sqrt{-2\ln\alpha} + \sqrt{-2\ln\beta}]^2}{\delta_1^2 \sum_{k=1}^{m} A_k^2/P_{0k}} \qquad (16)$$

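The ratio $N/E(n/\theta)$ as $\alpha, \beta \to 0$ will be defined as the partition asymptotic relative efficiency (PARE):

$$\mathrm{PARE} = \lim_{\alpha=\beta\to 0} \frac{N}{E(n/\theta)} = \lim_{\alpha=\beta\to 0} \frac{-\tfrac{1}{2}[\sqrt{-2\ln\alpha} + \sqrt{-2\ln\beta}]^2\, t_0(\theta)\,[e^{a t_0(\theta)} - e^{b t_0(\theta)}]}{b(e^{a t_0(\theta)} - 1) + a(1 - e^{b t_0(\theta)})} \qquad (17)$$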


The PARE has similarities with the Pitman-Noether ARE in that both let the respective sample sizes $\to \infty$ but in different ways. The Pitman-Noether ARE lets $H_1 \to H_0$ while fixing $\alpha$ and $\beta$, and PARE lets $\alpha, \beta \to 0$ for some $\theta$. However, both are independent of $\alpha$ and $\beta$ asymptotically.

Instead of letting $\alpha = \beta \to 0$, it is convenient to let $a = |b| \to \infty$, thus obtaining

$$\mathrm{PARE} = \begin{cases} 4, & t_0(\theta) = 1 \\ 4, & t_0(\theta) = -1 \\ \to\infty, & t_0(\theta) \to -\infty \\ 0, & t_0(\theta) = t_0(\theta_c) \end{cases} \qquad (18)$$


At $\theta = \theta_0$ or $\theta = \theta_1$ the SPD is four times as efficient in terms of PARE as its fixed sample size counterpart. For $\theta_0 < \theta < \theta_1$, the SPD loses its advantage over the fixed sample size detector.
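
The limiting values in Eq. (18) can be reproduced numerically from Eq. (17). The sketch below is a hedged illustration, not part of the original derivation: the function name `pare` is hypothetical, and it assumes symmetric thresholds $b = -a$ together with Wald's small-error approximation $a \approx -\ln\alpha$ for $\alpha = \beta$, so that the bracketed term $[\sqrt{-2\ln\alpha} + \sqrt{-2\ln\beta}]^2$ becomes $8a$.

```python
import math

def pare(t0, a):
    """Eq. (17) evaluated at finite a = |b|, assuming alpha = beta and a ~ -ln(alpha)."""
    b = -a
    bracket = 8.0 * a  # [sqrt(-2 ln alpha) + sqrt(-2 ln beta)]^2 with -ln alpha ~ a
    num = -0.5 * bracket * t0 * (math.exp(a * t0) - math.exp(b * t0))
    den = b * (math.exp(a * t0) - 1.0) + a * (1.0 - math.exp(b * t0))
    return num / den

# t0 = 1 corresponds to theta = theta_0, t0 = -1 to theta = theta_1.
values = [pare(1.0, 30.0), pare(-1.0, 30.0), pare(-2.0, 20.0)]
```

For large `a` the expression behaves like 4|t0|, which is consistent with the entries of Eq. (18): 4 at $t_0 = \pm 1$, growth without bound as $t_0 \to -\infty$, and 0 as $t_0 \to 0$. The function is not defined at exactly `t0 = 0`, where the limiting form of the ASN applies instead.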

In practical situations one wishes to have a finite truncation time. In the latter situation, procedures suggested by Bussgang and Middleton, Weiss, Anderson, Kiefer and Weiss, and Bussgang and Marcus are applicable.

A FLEXIBLE MODEL FOR IMPULSIVE NOISE
L. Kurz

In most data transmission systems the noise distribution is appropriately modelled by a mixture distribution consisting of gaussian and impulsive noise. Though the available data on the statistical behavior of impulsive noise is limited, it is generally recognized that the impulsive noise may be represented by clusters of impulses with a near-Poisson distribution of occurrence and an amplitude distribution with large variance. In view of this limited knowledge, representations and procedures are suggested here which permit the impulsive noise and mixture distribution to be represented by a flexible model both in terms of occurrence and amplitude distributions. In addition, a recursive procedure for estimating the mixture parameter is also given.

A. A Model for Amplitude Distribution of Impulsive Noise

In this section, a one-dimensional form of a representation theory, which results in a good fit with a small number of terms, will be derived. The representation theory is based on the concepts leading to the Gram-Charlier type A series and is similar to the development of Ref. 1 for dispersive channels.

Consider a generalized Fourier series

$$\hat{p}(x) = \sum_{i=1}^{N} c_i\, \phi_i(x)\, p_0(x)$$

where $p_0(x)$ is some properly preselected reference p.d.f. and $\hat{p}(x)$ represents an estimator of the actual p.d.f. The choice of the orthonormal set $\{\phi_i(x)\}$ corresponds to the choice of $p_0(x)$ and is based on the approach given by Szego.

As suggested by Algazi and Lerner, a good choice for $p_0(x)$ is

$$p_0(x) = a \exp(-b|x|^{1/n})$$


where 


(2) 


. 1 < — < 1 
n 

Following Szego, the orthonormal polynomials $\{\phi_n(x)\}$ are generated from the moments, $m_n$, of the reference distribution by

$$\phi_n(x) = (D_{n-1} D_n)^{-1/2} \begin{vmatrix} m_0 & m_1 & m_2 & \cdots & m_n \\ m_1 & m_2 & m_3 & \cdots & m_{n+1} \\ \vdots & \vdots & \vdots & & \vdots \\ m_{n-1} & m_n & m_{n+1} & \cdots & m_{2n-1} \\ 1 & x & x^2 & \cdots & x^n \end{vmatrix}$$

where for $n \ge 0$

$$D_n = [m_{\nu+\mu}]_{\nu,\mu = 0, 1, 2, \ldots, n}$$
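
For numerical work the same polynomials can be obtained without expanding determinants. The sketch below is a hedged illustration rather than the procedure of the report: it applies Gram-Schmidt orthonormalization to the monomials $1, x, x^2, \ldots$ using the inner product $\langle x^i, x^j \rangle = m_{i+j}$, which yields the orthonormal polynomials with positive leading coefficient. The function name `orthonormal_polys` and the coefficient-list representation are hypothetical choices.

```python
import math

def orthonormal_polys(moments, n):
    """Return coefficient lists (lowest degree first) for phi_0 .. phi_n.

    moments[k] is the k-th moment m_k of the reference distribution;
    moments up to order 2n are required.
    """
    def inner(p, q):
        # <p, q> = sum_i sum_j p_i q_j m_{i+j}
        return sum(pi * qj * moments[i + j]
                   for i, pi in enumerate(p) for j, qj in enumerate(q))

    polys = []
    for d in range(n + 1):
        p = [0.0] * d + [1.0]  # the monomial x^d
        for q in polys:
            c = inner(p, q)  # projection onto an earlier polynomial
            p = [pi - c * (q[i] if i < len(q) else 0.0)
                 for i, pi in enumerate(p)]
        norm = math.sqrt(inner(p, p))
        polys.append([pi / norm for pi in p])
    return polys

# Moments m_0 .. m_6 of the unit gaussian, used here only as a check case.
polys = orthonormal_polys([1.0, 0.0, 1.0, 0.0, 3.0, 0.0, 15.0], 3)
```

With the unit-gaussian moments the second-degree polynomial is $(x^2 - 1)/\sqrt{2}$ and the third-degree polynomial is $(x^3 - 3x)/\sqrt{6}$. For a symmetric reference p.d.f. with $m_0 = 1$ the normalizing factor of the second-degree polynomial is $\sqrt{m_4 - m_2^2}$.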

The  first  four  orthogonal  polynomials  are  then 


= 


2 "’2 
X - ■■*■■■  ■ 


J^4'^Z  J- 


m . - 
4 2 


<t>3(x) 


3 

X - 


J 


2 2 
2 6 2 4 


j 


2 6 2 4 


4>4(x)  = 


m . - m 


4 “'2  i 4 . r™4""2  ‘”’612  T 

■'Lt: — rz“J^  'U 


4 2 


m . + nio 

4 2 


m .irj^  - m, 
4 2 6 


m . - 
4 2 


]} 


where 


c = nig(m^  - m^)  + 2mg(m^m2  - m^)  + m^[^ 


m - m, 
4 2 6 


m . - 
4 2 


] 


r 2 n2 

-m  m -m,-|  , Lm.-m,m,J  ^ 

[-i— ? ^ 2(m^-  m2m^)m^ 


m . - 
4 2 


m . “ m- 
4 2 


If the unknown p.d.f. p(x) is of the form

$$p(x) = a_1 \exp(-b_1 |x|)$$

and the reference p.d.f. is

$$p_0(x) = a_2 \exp(-b_2 |x|^{1/n}) \qquad (5)$$

with

$$\hat{p}(x) = \sum_{i=1}^{N} c_i\, \phi_i(x)\, p_0(x) \qquad (6)$$

then  it  can  be  shown  that  for  the  approximation  to  be  valid 

Typically,  N = 4 gives  the  relationship  between  n and  m for  an  acceptable  approximation. 
Consider the mixture distribution noise model with p.d.f.

$$p_m(x) = (1 - q)\,p_g(x) + q\,p_i(x) \qquad (8)$$

where $p_m(\cdot)$, $p_g(\cdot)$ and $p_i(\cdot)$ are the amplitude p.d.f.'s of the mixture, gaussian and impulsive noise components, respectively.

If $p_m(x)$ is of the form

$$p_m(x) = a_m \exp(-b_m |x|)$$


and 

p^(x)  = a^  exp  - bjx| 

it  is  easy  to  show  that,  for  a given  q and  n,  the  value  of  v such  that 

(s^_j)!  > (9) 

For instance, for N = 4 and n = 1, for an actual q = 0.03 and v = 1/2, the bound of Eq. (9) yields q = 0.053.

The one-dimensional representation may be extended to two or more dimensions following the procedure of Reference 4.

Masry points out that the two-dimensional expansions of this type are not unique nor need they converge to the actual joint p.d.f. This convergence is not needed in practical situations. What is needed is that a finite number of terms in the series represent the desired p.d.f. adequately for a given application. The latter goal is almost always achieved.

B. Representation of the Time of Occurrence of Impulsive Noise and Estimation of the Mixing Parameter

Following the philosophy of Section A, the frequency of occurrence of the noise impulses may be described by a Gram-Charlier type B series. Using the Poisson distribution as the reference measure, the frequency distribution of the impulsive noise may be represented by

$$P(x = k) = f(k;\lambda) \sum_{i=0}^{\infty} a_i\, C_i(k;\lambda) \qquad (10)$$

where

$$f(k;\lambda) = \lambda^k e^{-\lambda}/k!$$


C,(k;X)  = U-l|*'’^\x) 


1 \ V.  , «- 


^ X''i!  dX*  ■- 

If a finite number of terms of Eq. (10), say K, represent adequately the frequency of occurrence of the impulses, all the parameters necessary for this representation may be obtained by determining or estimating the K + 1 moments of the distribution following a procedure similar to the one presented in Section A.

In general, the problem of finding a suitable estimator of the mixing parameter, q, based on independent signal-free samples, is a problem of optimizing a suitable functional.

It is easy to show that for the model of Eq. (8), using the minimum integral-square error criterion, the problem reduces to minimization of

$$I(q) = \int_{-\infty}^{\infty} [p_m(x) - (1 - q)\,p_g(x) - q\,p_i(x)]^2\, dx \qquad (11)$$

which, following the usual procedures of the calculus of variations, yields

$$q = -\frac{1}{c_1}\left[\int_{-\infty}^{\infty} p_m(x)\,[p_g(x) - p_i(x)]\, dx + c_2\right] \qquad (12)$$

where

$$c_1 = \int_{-\infty}^{\infty} [p_g(x) - p_i(x)]^2\, dx$$

and

$$c_2 = -\int_{-\infty}^{\infty} p_g(x)\,[p_g(x) - p_i(x)]\, dx$$

Introducing  a random  function  Zp  (|.x)  defined  as 

if|<x 

Zp  (l.x)  = { 

m 0 otherwise 

it  can  be  seen  that 

EfCZp  (4.x)]=P^{x) 
m 

where  P^(x)  is  the  c.d.f.  of  the  mixture  distribution.  Thus,  the  regression  equation 

oo^lZlp  (l.x) 

E^ty(q.x)/q]  = E^[j  ^ g [p^(x)  - p.(x)]dx  + qCj  + 

= - Pi<x)l  + ^^1  + ^2  (13) 

which  is  equivalent  to  finding  the  minimum  of  Equation  (11). 

The problem of finding q of Eq. (12) is, therefore, equivalent to the problem of finding the zero of regression Equation (13). Following the Robbins-Monro procedure, an iterative solution for q is obtained from observations $\{y_i\}$ in the form

$$q_{i+1} = q_i - a_i\, y_i \qquad (14)$$

where $a_i = a/i$ and $a > 0$.

Noting that $q \in [0, 1]$, the random function $y(q_i, x)$ is redefined as

$$y(q_i, x) = \begin{cases} -C_0 & \text{if } q_i < 0 \\ [p_g(x) - p_i(x)] + q_i c_1 + c_2 & \text{if } 0 \le q_i \le 1 \\ +C_0 & \text{if } q_i > 1 \end{cases} \qquad (15)$$

where $C_0$ is a suitable positive constant. It is readily shown that with the choice of $y(q_i, x)$ given by Eq. (15), all conditions for the Robbins-Monro algorithm are satisfied. Thus, the recursive estimator of Eq. (14) converges to the optimum estimator in the mean square sense and with probability one. The usual acceleration procedures are also applicable.
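
The recursion of Eqs. (14) and (15) can be sketched as follows. This is an illustrative sketch under explicit assumptions, not the procedure as implemented by the author: the function names are hypothetical; the impulsive component is taken to be a zero-mean gaussian with large variance purely for convenience; the constants are assumed to be $c_1 = \int [p_g - p_i]^2 dx$ and $c_2 = -\int p_g [p_g - p_i] dx$, which is the choice that makes the regression function vanish at the true q; and the gain is set to $a = 1/c_1$.

```python
import math
import random

def normal_pdf(x, sigma):
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(f, lo=-60.0, hi=60.0, steps=24000):
    """Plain trapezoidal rule; adequate for the smooth densities used here."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for j in range(1, steps):
        total += f(lo + j * h)
    return total * h

def estimate_q(samples, p_g, p_i, q0=0.5, c0=0.1):
    """Robbins-Monro estimate of the mixing parameter q from noise-only samples."""
    c1 = integrate(lambda x: (p_g(x) - p_i(x)) ** 2)
    c2 = -integrate(lambda x: p_g(x) * (p_g(x) - p_i(x)))
    a = 1.0 / c1  # assumed gain constant
    q = q0
    for i, x in enumerate(samples, start=1):
        if q < 0.0:
            y = -c0  # push the estimate back toward [0, 1]
        elif q > 1.0:
            y = c0
        else:
            y = (p_g(x) - p_i(x)) + q * c1 + c2
        q -= (a / i) * y  # q_{i+1} = q_i - a_i y_i with a_i = a / i
    return q

def demo(q_true=0.2, n=20000, seed=1):
    rng = random.Random(seed)
    p_g = lambda x: normal_pdf(x, 1.0)  # gaussian background
    p_i = lambda x: normal_pdf(x, 5.0)  # assumed stand-in for impulsive noise
    samples = [rng.gauss(0.0, 5.0) if rng.random() < q_true else rng.gauss(0.0, 1.0)
               for _ in range(n)]
    return estimate_q(samples, p_g, p_i)
```

Calling `demo()` draws samples from a mixture with q = 0.2 and returns an estimate that should settle near that value as the number of samples grows.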

In practical situations, the $p_i(x)$ of Eq. (15) is not known. In the latter case, $p_i(x)$ may be replaced by $\hat{p}_i(x)$ as found using the method of Section A.

SUBOPTIMUM DETECTION IN NONSTATIONARY GAUSSIAN NOISE
E.  Kurz 

The problem of detection and estimation in nonstationary gaussian noise has been considered by many researchers. The approaches used to solve the problem lead to complicated mathematical formulations. Some of the difficulties become clear if one considers just one such approach presented elsewhere in this report. The purpose of this paper is to develop a suboptimum procedure to solve the problem. In particular, an approach originally suggested by Tukey (Ref. 1) and modified by Priestley (Refs. 2, 3) is used to generate a set of observables which permit a reasonable solution to the problem. In addition, the techniques suggested here help one understand better the physical phenomena associated with the actual detection and estimation process.


A. Generation of the Observables

As the first step in the development of a meaningful procedure for comparing two different gaussian stochastic processes, an evolutionary spectral estimator is generated from the observed waveshapes. This procedure, originally suggested by Tukey, involves a process of complex demodulation. The actual complex demodulation is performed in two steps:

1) The observed sample function of the gaussian stochastic process, X(t), is demodulated by multiplying it by $\exp(-j\omega_0 t)$, where $\omega_0$ is the demodulation radian frequency.

2) The demodulated version of the observed sample function is smoothed by a low-pass discrete linear filter with impulse response $h_k$, producing

$$V(t) = \sum_{k}^{K} h_k\, X(t - k)\exp[-j\omega_0 (t - k)] \qquad (1)$$

where V(t) is the complex demodulate.

Priestley (Ref. 2) improved on Tukey's procedure by applying a second low-pass filter to the modulus of the complex demodulate. The evolutionary spectral estimator at time t and radian frequency $\omega_0$ is then

$$\hat{S}_{t,x}(\omega_0) = \sum_{k=1}^{K} w_k\, V(t - k)\, V^*(t - k) \qquad (2)$$

where $w_k$ is the second low-pass smoothing filter.
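
The two filtering stages can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the estimator design of Ref. 3: the function names are hypothetical, both filters are simple moving averages, and the sums run over k = 0, ..., K-1 (a causal indexing chosen for the example).

```python
import cmath
import math

def complex_demodulate(x, omega0, h):
    """First stage: V(t) = sum_k h[k] x[t-k] exp(-j omega0 (t-k)), for t >= len(h)-1."""
    K = len(h)
    v = []
    for t in range(K - 1, len(x)):
        acc = 0j
        for k in range(K):
            s = t - k
            acc += h[k] * x[s] * cmath.exp(-1j * omega0 * s)
        v.append(acc)
    return v

def evolutionary_spectrum(v, w):
    """Second stage: smooth |V|^2 = V V* with the low-pass filter w."""
    K = len(w)
    return [sum(w[k] * abs(v[t - k]) ** 2 for k in range(K))
            for t in range(K - 1, len(v))]

# Check case: a sinusoid of amplitude 2 at the demodulation frequency.
omega0 = 2.0 * math.pi / 8.0
x = [2.0 * math.cos(omega0 * t) for t in range(64)]
v = complex_demodulate(x, omega0, [1.0 / 8.0] * 8)
s_hat = evolutionary_spectrum(v, [0.25] * 4)
```

For this input the demodulated product equals 1 plus a component at twice the demodulation frequency; the eight-point moving average spans two full periods of that component and removes it, so the estimate is constant at (A/2)^2 = 1.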

The evolutionary cross-spectrum is estimated as follows:

1) For each sequence $X_1(t), X_2(t), \ldots, X_n(t)$ calculate the complex demodulate $V_i(t)$ using Equation (1).

2) Form an n x n matrix whose elements are the products $V_i(t)\,V_k^*(t)$, where i, k = 1, 2, ..., n.

3) Smooth the matrix of cross-products with a low-pass filter $w_k$.

Using these steps, the evolutionary cross-spectral density matrix is generated with terms of the form given by Equation (2). The evolutionary cross-spectral density matrix may be treated as an estimate of a multivariate normal covariance matrix.

The two-stage filtering process suggested by Priestley may be considered as corresponding to a two-dimensional filter. The first stage represents the same operation as encountered in AM receivers. This operation singles out the neighborhood of one frequency for analysis. Once the time-varying power in the neighborhood of this frequency is isolated, the second filter is used for smoothing purposes. The choice of the parameters of the two-dimensional filter is presented in Ref. 3 and will not be given here.

Consider a univariate evolutionary process with sample function

$$X(t) = c(t)\,\psi(B)\,\varepsilon(t) \qquad (3)$$

where

$$c(t) = \begin{cases} \alpha & \text{for } t \in T_i \text{ for all odd } i \\ \beta & \text{for } t \in T_i \text{ for all even } i \end{cases}$$

and $\alpha$, $\beta$ are real constants. In Eq. (3), $\psi(B)$ is a rational function of B and $\varepsilon(t)$ is a sample function of a white noise process. The product $\psi(B)\varepsilon(t)$ may be interpreted as the stationary part of X(t) which is modulated by c(t). The design of the evolutionary process estimator requires knowledge of the change in c(t) with respect to time and the change in the stationary part $\psi(B)\varepsilon(t)$ with respect to frequency. These two characteristics of X(t) are measured by a time domain bandwidth, $B_t$, and the more traditional frequency domain bandwidth, $B_f$, respectively. The time domain bandwidth may be considered to be a measure of the maximum time interval over which X(t) may be treated as approximately stationary.

The $T_i$ are selected as interarrival times of a Poisson process with a mean $\lambda$. The Poisson assumption implies that the only available information on the occurrence of the next event is the knowledge of the mean time of such occurrences. For further details of the design procedure one should consult Reference 3.
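
A sample function of the process in Eq. (3) can be simulated for experimentation. The sketch below is illustrative only and makes several assumptions that are not in the text: time is discrete, the stationary part is a first-order autoregression (that is, $\psi(B) = 1/(1 - \phi B)$ with B read as the backshift operator), the interval lengths are exponential draws rounded to at least one sample, and all function and parameter names (`modulating_sequence`, `evolutionary_process`, `mean_len`, `phi`) are hypothetical.

```python
import random

def modulating_sequence(n, alpha, beta, mean_len, rng):
    """c(t): alpha on odd-numbered intervals T_1, T_3, ..., beta on even ones."""
    c = []
    odd = True
    while len(c) < n:
        # exponential interval length, as for Poisson interarrival times
        length = max(1, round(rng.expovariate(1.0 / mean_len)))
        c.extend([alpha if odd else beta] * length)
        odd = not odd
    return c[:n]

def evolutionary_process(n, alpha, beta, mean_len, phi, seed=0):
    """X(t) = c(t) * s(t), with s(t) = phi * s(t-1) + eps(t) and eps white gaussian."""
    rng = random.Random(seed)
    c = modulating_sequence(n, alpha, beta, mean_len, rng)
    s = 0.0
    x = []
    for t in range(n):
        s = phi * s + rng.gauss(0.0, 1.0)  # stationary part psi(B) eps(t)
        x.append(c[t] * s)
    return c, x

c, x = evolutionary_process(500, alpha=1.0, beta=3.0, mean_len=50.0, phi=0.6)
```

The resulting sequence is locally stationary within each interval and changes its power level at the switching instants, which is the behavior the time domain bandwidth is meant to capture.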

B.  The  Detection  Problem 


The detection problem of comparing two evolutionary processes is posed as a problem of hypothesis testing. In particular, given a sample function of a process X(t) known to be generated by a linear transformation of one of the two processes Y(t) and Z(t) for $t \in T_i$, then X(t) ~ Y(t) is the hypothesis, $H_0$, and X(t) ~ Z(t) is the alternative, $H_1$. As the test statistic to test the hypothesis one introduces

■ ^xz/y  (4) 

where the r's are the appropriate partial coherence coefficients. These coefficients are calculated using Priestley's procedure. The estimates of the r's following this procedure are then

where $\xi(t)$ is a sample function of colored gaussian noise. The process, $\xi(t)$, is related to the white gaussian process, $\varepsilon(t)$, by

$$N(B)\,\xi(t) = D(B)\,\varepsilon(t) \qquad (6)$$

where N(B) and D(B) are polynomials in B. Equation (6) represents the autoregressive moving average (ARMA) model. The actual hypothesis testing is then

(T<^^  - T^°V  A'*q  > A,  Aq>0  (7) 

implies  that  is  rejected  and  the  threshold,  Aq,  is  selected  in  the  usual  manner.  In 
Eq.  (7) 


H- :T  = T^®^ 

for  all  t e T. 

0 

1 

Hj  :T  = T^^^ 

for  all  t £ T. 

I 

q*=  (91.92'  ■ 

i i 

t‘=  (Tj,  T^. 

and  A = E[ee^] 
i 

The problem of inverting A may be solved by the numerous available computer-oriented techniques.

ON THE PARTITIONING REPRESENTATION FOR FINITE SETS AND ITS APPLICATION
T. Lee and E.J. Smith

A set of rules for representing the partitions on a finite set is developed and algorithms for computing the least upper and greatest lower bounds are given. Several applications including methods for determining the set of partitions having the substitution property, the set of partition pairs, and the minimal reduced machine for a given finite-state sequential machine are illustrated. The approach is also extended to find the subsemigroups of a given semigroup.

The concepts of equivalence relation and partitions are widely used in dealing with sequential machines and yield fruitful results in decomposition theory; however, the computation procedure is complicated and troublesome, especially in the case of incompletely specified machines. The purpose of the present work is to develop a general approach for computing the desired partitions with the aid of a digital computer. In the following summary, proofs of theorems and lemmas are omitted for brevity.

Hutchinson has given a set of rules for representing the partitions on a finite set of n elements by an n-tuple integer array and an algorithm for generating these partitions; a "loopless" algorithm for generating the partitions has also been developed by Ehrlich. We will define an alternate set of rules for the representation of partitions which preserves the partial ordering. The conversion between the two representations is straightforward.

A partition $\pi$ on a finite set X is a collection of mutually disjoint non-empty subsets of X whose union is X. The members of $\pi$ are called blocks or equivalence classes and the existence of two elements a, b of X in the same block of $\pi$ is denoted by $a \equiv b\,(\pi)$. The n elements of a given finite set are numbered 1, 2, ..., n. Without loss of generality, let X be the set of integers {1, 2, ..., n} and let the set of all partitions on X be $\Pi(X)$.

The representation of a partition $\pi = \{B_1, B_2, \ldots, B_k\}$ is an n-tuple integer array $(a_1, a_2, \ldots, a_n)$ such that for $i \in B_k$, $a_i = \min B_k$, where $\min B_k$ is the smallest element in $B_k$.

Suppose n = 7, then the partition

$$\pi = \{\{1, 3, 6\}, \{2, 5\}, \{4, 7\}\}$$

is represented by the 7-tuple array (1, 2, 1, 4, 2, 1, 4). We may consider the n-tuple $(a_1, a_2, \ldots, a_n)$ as an ordered set or image corresponding to a mapping from the set X into itself, $f: X \to X$, such that $(a_1, a_2, \ldots, a_n) = (f(1), f(2), \ldots, f(n))$. The question arises as to whether the mapping $f: X \to X$ is a valid representation of the partition on the set X. A mapping $f: X \to X$ represents a partition on X if it satisfies the following criteria:

(1) $f(i) \le i$ for $1 \le i \le n$ (contraction)

(2) $f \cdot f(i) = f(i)$ for $1 \le i \le n$ (idempotency)

where $\cdot$ denotes the functional composition, i.e., $f \cdot g(i) = f(g(i))$.
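
The representation and the two validity criteria can be illustrated with a short Python sketch. The function names `to_array` and `is_partition_map` are hypothetical, and Python lists are 0-based, so element i of the set X is stored at position i - 1.

```python
def to_array(blocks, n):
    """Map a partition (iterable of blocks of 1..n) to its n-tuple (a_1, ..., a_n)."""
    a = [0] * n
    for block in blocks:
        smallest = min(block)
        for i in block:
            a[i - 1] = smallest  # a_i = min of the block containing i
    return a

def is_partition_map(f):
    """Check contraction f(i) <= i and idempotency f(f(i)) = f(i) for 1 <= i <= n."""
    n = len(f)
    return all(1 <= f[i - 1] <= i and f[f[i - 1] - 1] == f[i - 1]
               for i in range(1, n + 1))

example = to_array([{1, 3, 6}, {2, 5}, {4, 7}], 7)
```

For the partition used in the text, `example` is the array (1, 2, 1, 4, 2, 1, 4). An array such as (1, 1, 2) satisfies contraction but fails idempotency, since f(3) = 2 while f(2) = 1.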

Let F(X) be the set of all mappings from X into itself which satisfy the criteria of contraction (1) and idempotency (2).

Definition: Let $\Phi: F(X) \to \Pi(X)$ for $f \in F(X)$ and $\Phi(f) = \pi \in \Pi(X)$ iff

$$f(i) = f(j) \iff i \equiv j\,(\pi), \quad \forall\, i, j \in X$$

Lemma 1: Given $\pi = \{B_1, B_2, \ldots, B_k\} \in \Pi(X)$ and $f \in F(X)$ such that $\Phi(f) = \pi$, then $f(i) = \min B_k$ for $1 \le i \le n$ and $i \in B_k$.

Lemma 2: $\Phi: F(X) \to \Pi(X)$ is one-to-one and onto.

Definition: For $f_1, f_2 \in F(X)$, if $f_1(i) = f_1(j)$ implies $f_2(i) = f_2(j)$, $\forall\, i, j \in X$, then $f_1 \le f_2$. The set F(X) with the partial ordering $\le$ is a partially ordered set (POS).

Theorem 1: Given $f_1, f_2 \in F(X)$ then $f_1 \le f_2$ iff $f_2 \cdot f_1 = f_2$.

Corollary 1: If $f_1 \le f_2$, then $f_2(i) \le f_1(i)$ for all $1 \le i \le n$.

Definition: A mapping $\theta: L \to M$ from a POS L to a POS M is called an order-preserving mapping iff, $\forall\, a, b \in L$, $a \le b$ implies $\theta(a) \le \theta(b)$.

Lemma 3: The mapping $\Phi: F(X) \to \Pi(X)$ is an order-preserving mapping.

The least upper bound of $f_1, f_2 \in F(X)$ is $f = \mathrm{lub}(f_1, f_2) = f_1 \vee f_2$ such that $f \ge f_1$, $f \ge f_2$, and for any $g \in F(X)$, $g \ge f_1$, $g \ge f_2$ implies $g \ge f$. The definition for the greatest lower bound $f = \mathrm{glb}(f_1, f_2) = f_1 \wedge f_2$ is similar. The following two theorems show the existence of lub and glb, and lead to the algorithms for computing them.

Theorem  2:  For  £j,f2£  F(X)  let 

ho(i)  = £2  • fi(i) 


hj(i)  = hg  • £2(1) 
h2(i)  = hj  • fj(i) 


h,  (i)  = h,  , • £ (i)  for  all  1 < i < n 
k k-1  s — — 


where 


/ 1 if  k is  even 
2 if  k is  odd 


I 


I 

1 
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and  k is  the  smallest  integer  such  that  hj^=  then  f = h^^  = lub(fj,f2). 

Algorithm:  lub 

For $f_1, f_2 \in F(X)$ and $f = \mathrm{lub}(f_1, f_2)$, let A(i) = $f_1(i)$, B(i) = $f_2(i)$ and X(i) = $f(i)$ for
$1 \le i \le n$; the algorithm follows.

1.  Read A(i), B(i) for $1 \le i \le n$.

2.  Set X(i) = A(i) and Y(i) = B(i) for $1 \le i \le n$.

3.  Set k = -1.

4.  If X(i) = Y(i) for $1 \le i \le n$, then go to step 6; otherwise set k = k + 1.

5.  If k is even then set X(i) = Y(A(i)) for $1 \le i \le n$ and go to step 4; otherwise set
Y(i) = X(B(i)) for $1 \le i \le n$ and go to step 4.

6.  Record X(i) for $1 \le i \le n$.
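
The six steps above translate almost line for line into a short program. The following is a minimal sketch in Python, not part of the original report; the function name `lub` and the use of Python lists (with entry `i-1` holding the value for element `i`) are assumptions made for illustration.

```python
def lub(a, b):
    """Least upper bound of two partition representations.

    a[i-1] and b[i-1] hold f1(i) and f2(i): the least element of the
    block containing i. Follows steps 1-6 of the lub algorithm.
    """
    n = len(a)
    x, y = list(a), list(b)          # step 2
    k = -1                           # step 3
    while x != y:                    # step 4
        k += 1
        if k % 2 == 0:               # step 5, k even: X(i) = Y(A(i))
            x = [y[a[i] - 1] for i in range(n)]
        else:                        # step 5, k odd: Y(i) = X(B(i))
            y = [x[b[i] - 1] for i in range(n)]
    return x                         # step 6


# Blocks {1,2},{3,4} joined with {1},{2,3},{4} give the single block {1,2,3,4}.
print(lub([1, 1, 3, 3], [1, 2, 2, 4]))
```

Each pass composes the current map with a contraction, so values can only decrease and the loop terminates.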

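Theorem 3: For $f_1, f_2 \in F(X)$, let $f(1) = 1$ and for $i \ge 2$ define $f(i)$ recursively:

$f(i) = \begin{cases} f(k) & \text{if } f_1(i) = f_1(k) \text{ and } f_2(i) = f_2(k) \text{ for some } 1 \le k < i \\ i & \text{otherwise} \end{cases}$

then $f = \mathrm{glb}(f_1, f_2)$.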

Algorithm: glb

1.  Read A(i), B(i) for $1 \le i \le n$.

2.  Set i = 2 and X(1) = 1.

3.  Set k = 1.

4.  If A(i) = A(k) and B(i) = B(k), then set X(i) = X(k) and go to step 6; otherwise set
k = k + 1.

5.  If k = i, then set X(i) = i and go to step 6; otherwise go to step 4.

6.  Set i = i + 1.

7.  If $i \le n$ then go to step 3; otherwise record X(i) for $1 \le i \le n$.
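
The glb procedure can be sketched the same way. This Python version is illustrative only; the name `glb` and the list-based representation (entry `i-1` holds the value for element `i`) are assumptions, and the inner search over earlier elements stands in for steps 3 to 5.

```python
def glb(a, b):
    """Greatest lower bound of two partition representations.

    a[i-1] and b[i-1] hold f1(i) and f2(i). Element i joins the block of
    the first earlier element k that agrees with it under both f1 and f2;
    otherwise it starts a new block whose least element is i.
    """
    n = len(a)
    x = [0] * n
    x[0] = 1                                  # X(1) = 1
    for i in range(1, n):                     # elements 2..n (0-based index i)
        x[i] = i + 1                          # default: X(i) = i
        for k in range(i):                    # search earlier elements
            if a[i] == a[k] and b[i] == b[k]:
                x[i] = x[k]
                break
    return x


# {1,2,3},{4} met with {1,2},{3,4} gives {1,2},{3},{4}.
print(glb([1, 1, 1, 4], [1, 1, 3, 3]))
```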

The triple $\{F(X), \vee, \wedge\}$ is a lattice and, moreover, the mapping $\Phi : F(X) \to \Pi(X)$ is a
morphism of lattices (Ref. 4).

Theorem 4: For $f = \mathrm{lub}(f_1, f_2) = f_1 \vee f_2$ and $g = \mathrm{glb}(f_1, f_2) = f_1 \wedge f_2$,
$\Phi(f) = \Phi(f_1) \vee \Phi(f_2)$ and $\Phi(g) = \Phi(f_1) \wedge \Phi(f_2)$.

A.  Applications 

Partitions  with  the  substitution  property  are  essential  tools  in  the  study  of  sequential 
machines.  In  this  section  we  will  discuss  several  examples  of  algorithms  for  generating 
partitions  having  various  properties.  The  approach  used  in  obtaining  these  algorithms 
is  unified  so  that  the  results  can  be  extended  to  that  of  finding  the  quotient  structures 
of a given semigroup as well as of other finite algebraic systems.

A machine (Mealy) is a 5-tuple
$[S, X, Z, \delta, \lambda]$

where

(1)  $S = \{1, 2, \ldots, n\}$ is the set of states, of size n

(2)  $X = \{1, 2, \ldots, p\}$ is the input alphabet, of size p

(3)  $Z = \{1, 2, \ldots, m\}$ is the output alphabet, of size m

(4)  $\delta : S \times X \to S$ is the next state function

(5)  $\lambda : S \times X \to Z$ is the output function.

We now consider the case of deterministic machines. X(i), Y(i) for $1 \le i \le n$ are n-tuple
variable arrays; NX(i,k), Z(i,k) for $1 \le i \le n$, $1 \le k \le p$ are two-dimensional arrays;
and A(i,j) for $1 \le i \le n$, $1 \le j \le n$, and B(i,j,k), C(i,j,k) for $1 \le i \le n$, $1 \le j \le n$, $1 \le k \le m$
are Boolean arrays.

1.  The  Algorithm  for  Generating  All  Partitions  with  the  Substitution  Property 

Definition. A partition $\pi$ on the set of states of machine M is said to have the
substitution property iff, $\forall\, k \in X$, $\forall\, i, j \in S$,

$(i \equiv j\ (\pi)) \supset (\delta(i,k) \equiv \delta(j,k)\ (\pi))$.

Suppose $f \in F(S)$ and $\Phi(f) = \pi$; then we say f has the substitution property iff

$(\forall k \in X)(\forall i, j \in S),\ (f(i) = f(j)) \supset (f(\delta(i,k)) = f(\delta(j,k)))$   (1)

We may call Eq. (1) the logic equation; that is to say, any $f \in F(S)$ has the substitution
property if and only if it satisfies the logic Equation (1). Let X(i) = $f(i)$, $\forall\, i \in S$,

NX(i,k) = X($\delta(i,k)$), $\forall\, i \in S$, $k \in X$,

$A(i,j) = \begin{cases} 1 & \text{if } X(i) = X(j) \\ 0 & \text{otherwise} \end{cases} \quad \forall\, i, j \in S$

$B(i,j,k) = \begin{cases} 1 & \text{if } NX(i,k) = NX(j,k) \\ 0 & \text{otherwise} \end{cases} \quad \forall\, i, j \in S,\ k \in X$

The logic Eq. (1) can be written as

$(\forall k \in X)(\forall i, j \in S),\ (A(i,j) \supset B(i,j,k))$   (2)

Since $a \supset b = \neg a \vee b$, logic Eq. (2) can be expressed as

$(\forall k \in X)(\forall i, j \in S),\ (\neg A(i,j) \vee B(i,j,k))$   (3)
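
As an illustration of how the logic equation becomes a program, here is a minimal Python sketch of the test in form (3). It is not from the report: the function name `has_substitution_property` and the layout `delta[k][i]` (next state of state `i+1` under input `k+1`) are assumptions chosen for this example.

```python
def has_substitution_property(f, delta):
    """Evaluate logic Eq. (3) for a partition representation f.

    f[i-1] is the least element of the block containing state i.
    delta[k][i] is the next state (1-based) of state i+1 under input k+1.
    """
    n, p = len(f), len(delta)
    X = f
    # NX(i,k) = X(delta(i,k))
    NX = [[X[delta[k][i] - 1] for k in range(p)] for i in range(n)]
    # Boolean arrays A(i,j) and B(i,j,k) as defined in the text
    A = [[X[i] == X[j] for j in range(n)] for i in range(n)]
    B = [[[NX[i][k] == NX[j][k] for k in range(p)]
          for j in range(n)] for i in range(n)]
    # Eq. (3): (not A(i,j)) or B(i,j,k) for every i, j, k
    return all((not A[i][j]) or B[i][j][k]
               for i in range(n) for j in range(n) for k in range(p))


# Hypothetical 4-state machine with one input that swaps 1<->2 and 3<->4.
delta = [[2, 1, 4, 3]]
print(has_substitution_property([1, 1, 3, 3], delta))  # blocks {1,2},{3,4}
print(has_substitution_property([1, 2, 1, 4], delta))  # blocks {1,3},{2},{4}
```

The first partition passes because each block is mapped into a single block; the second fails because states 1 and 3 go to states 2 and 4, which lie in different blocks.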

Equations (1), (2) and (3) are all equivalent and can be written in a high-level computer
language such as PL/I or Algol 60. Figure 1 shows the flow chart for generating all
partitions with the substitution property.

2.  Algorithm for Generating All State-to-State Partition Pairs

Definition: $(\pi, \tau)$ is a State-to-State (SS) partition pair iff, $\forall\, i, j \in S$, $\forall\, k \in X$,

$(i \equiv j\ (\pi)) \supset (\delta(i,k) \equiv \delta(j,k)\ (\tau))$

For $f, g \in F(S)$, the partition representations of $\pi$ and $\tau$, respectively, let X(i) = $f(i)$,
Y(i) = $g(i)$ and NX(i,k) = Y($\delta(i,k)$), $\forall\, i \in S$, $\forall\, k \in X$; then f, g represents an SS pair iff

$(\forall k \in X)(\forall i, j \in S),\ (X(i) = X(j)) \supset (NX(i,k) = NX(j,k))$   (4)

The only difference between logic Eqs. (4) and (2) in example 1 is the array NX(*,*)
defined by another partition representation g. Figure 2 shows the flow chart for
generating all SS partition pairs.

3.  Minimal  Reduced  Sequential  Machine 

The  machine  equivalent  to  machine  M having  a minimal  number  of  states  is  called 
the  minimal  reduced  machine  of  M. 

Definition: Partition $\pi$ on the set of states is output-consistent iff $\forall\, k \in X$,
$\forall\, i, j \in S$, $(i \equiv j\ (\pi)) \supset (\lambda(i,k) = \lambda(j,k))$.

If partition $\pi$ is the maximal output-consistent partition with the substitution
property, then the quotient machine $M/\pi$ is the minimal reduced machine. In general,
we have to solve the problem for incompletely specified machines; therefore we have
to extend the definition of output-consistent partition with substitution property. For
the problem of concern in this report, we simply treat the unspecified states or outputs
as don't-care conditions in the sense that the statement $\delta(i,k) \equiv \delta(j,k)\ (\pi)$ is always true
for any $\pi$ if either $\delta(i,k)$ or $\delta(j,k)$ is unspecified; similarly the statement $\lambda(i,k) = \lambda(j,k)$
is always true if either $\lambda(i,k)$ or $\lambda(j,k)$ is unspecified. Then we have the extended
definition:

Partition $\pi$ is output-consistent with the substitution property iff $\forall\, k \in X$, $\forall\, i, j \in S$,
$(i \equiv j\ (\pi)) \supset ((\delta(i,k) \equiv \delta(j,k)\ (\pi)) \vee (\delta(i,k) \text{ unspecified}) \vee (\delta(j,k) \text{ unspecified}))$, and
$(i \equiv j\ (\pi)) \supset ((\lambda(i,k) = \lambda(j,k)) \vee (\lambda(i,k) \text{ unspecified}) \vee (\lambda(j,k) \text{ unspecified}))$.

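For $f \in F(S)$, the partition representation of $\pi$, let X(i) = $f(i)$ for $\forall\, i \in S$,

$NX(i,k) = \begin{cases} X(\delta(i,k)) & \text{if } \delta(i,k) \text{ specified} \\ n+1 & \text{if } \delta(i,k) \text{ unspecified (note that } n+1 \notin S\text{)} \end{cases} \quad \forall\, i \in S,\ \forall\, k \in X$

$Z(j,k) = \begin{cases} \lambda(j,k) & \text{if } \lambda(j,k) \text{ specified} \\ m+1 & \text{if } \lambda(j,k) \text{ unspecified (note that } m+1 \notin Z\text{)} \end{cases} \quad \forall\, j \in S,\ \forall\, k \in X$

$A(i,j) = \begin{cases} 1 & \text{if } X(i) = X(j) \\ 0 & \text{otherwise} \end{cases} \quad \forall\, i, j \in S$

$B(i,j,k) = \begin{cases} 1 & \text{if } (NX(i,k) = NX(j,k)) \vee (NX(i,k) = n+1) \vee (NX(j,k) = n+1) \\ 0 & \text{otherwise} \end{cases} \quad \forall\, i, j \in S,\ \forall\, k \in X$

$C(i,j,k) = \begin{cases} 1 & \text{if } (Z(i,k) = Z(j,k)) \vee (Z(i,k) = m+1) \vee (Z(j,k) = m+1) \\ 0 & \text{otherwise} \end{cases} \quad \forall\, i, j \in S,\ \forall\, k \in X$

It follows that f is output-consistent with the substitution property iff

$(\forall k \in X)(\forall i, j \in S),\ (A(i,j) \supset (B(i,j,k) \wedge C(i,j,k)))$   (5)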

The flow chart is the same as the one in Fig. 1, except that the logic equation is
Equation (5). For machine M completely specified, we simply remove the don't-care
condition in the definition. Furthermore, since the output-consistent partition is always
smaller than or equal to the 1-equivalent partition $P_1$ (see Ref. 8, p. 287), the initial
partition (the universal partition) in the flow chart can be replaced by $P_1$, which is
obtained from the state table of completely specified machine M.

4.  Congruences  of  Semigroup 

A finite semigroup $\{G, p\}$ is a set $G = \{1, 2, \ldots, n\}$ with an associative binary
composition $p : G \times G \to G$. In the following discussion we will use the terms equivalence
relation and partition interchangeably.

Definition: An equivalence relation R (L) over a semigroup G is called a right
(left) congruence if aRb implies p(a,x) R p(b,x) (aLb implies p(x,a) L p(x,b), respectively),
$\forall\, x \in G$. An equivalence relation E over G which is both a right and a left
congruence is called a congruence over G; E is a congruence over G iff aEb, cEd implies
p(a,c) E p(b,d). The factor structure G/E is called the quotient semigroup of the
semigroup G (Refs. 5, 6).

Let $F(G)$ be the representation for the set of partitions on G. For $f \in F(G)$, f is a
right congruence iff, $\forall\, i, j, k \in G$,

$(f(i) = f(j)) \supset (f(p(i,k)) = f(p(j,k)))$   (6)

Similarly, f is a left congruence iff, $\forall\, i, j, k \in G$,

$(f(i) = f(j)) \supset (f(p(k,i)) = f(p(k,j)))$   (7)

Consequently, f is a congruence iff, $\forall\, i, j, k, m \in G$,

$((f(i) = f(j)) \wedge (f(k) = f(m))) \supset (f(p(i,k)) = f(p(j,m)))$   (8)

Let

$A(i,j) = \begin{cases} 1 & \text{if } f(i) = f(j) \\ 0 & \text{otherwise} \end{cases} \quad \forall\, i, j \in G$

then the logic Eqs. (6), (7) and (8) can be simply expressed by

$(\forall\, i, j, k \in G),\ A(i,j) \supset A(p(i,k), p(j,k))$   (9)

$(\forall\, i, j, k \in G),\ A(i,j) \supset A(p(k,i), p(k,j))$   (10)

$(\forall\, i, j, k, m \in G),\ (A(i,j) \wedge A(k,m)) \supset A(p(i,k), p(j,m))$   (11)

Eqs. (9), (10) and (11), respectively.

The  flow  chart  for  generating  congruences  is  straightforward. 
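
Although the report leaves the flow chart to the reader, the congruence test of Eq. (11) is short enough to sketch directly. The Python below is an illustration only; the names `is_congruence` and `add_mod4`, and the choice of the integers mod 4 (relabelled 1 to 4) as the example semigroup, are assumptions and do not come from the report.

```python
def is_congruence(f, p):
    """Check Eq. (11): (A(i,j) and A(k,m)) implies A(p(i,k), p(j,m)).

    f[i-1] is the least element of the block containing i; p is the
    binary composition on G = {1, ..., n}.
    """
    n = len(f)
    G = range(1, n + 1)

    def A(i, j):
        return f[i - 1] == f[j - 1]

    return all((not (A(i, j) and A(k, m))) or A(p(i, k), p(j, m))
               for i in G for j in G for k in G for m in G)


def add_mod4(a, b):
    """Addition mod 4 on {1,2,3,4}, where element e stands for e-1."""
    return (a + b - 2) % 4 + 1


print(is_congruence([1, 2, 1, 2], add_mod4))  # even/odd classes
print(is_congruence([1, 1, 3, 3], add_mod4))  # {1,2},{3,4}
```

The even/odd partition is a congruence; the partition {1,2},{3,4} is not, since 1 and 2 are equivalent but p(1,2) = 2 and p(2,2) = 3 fall in different blocks.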

We  have  achieved  the  objective:  a unified  and  systematic  approach  to  construct 
various  algorithms  for  computing  the  quotient  structures  of  finite  automata  as  well  as 
other  algebraic  structures,  such  that  the  algorithms  can  be  easily  written  into  any 
popular high-level computer language. In the following, the procedure for computing
the SP partitions for a given machine is illustrated. Other examples are given in
Reference 12. Further research might include algorithms for computing the quotient
structures of non-deterministic automata and other related disciplines such as
transition graphs.

PL/I  Program  to  Compute  the  SP  Partition  of  a Given  Machine 
The  state  table  of  a given  machine  M is  as  follows: 


                 NS
PS     x1   x2   x3   x4   x5

 1      2    1    5    8    3
 2      1    2    6    7    4
 3      4    3    6    6    1
 4      3    4    5    5    2
 5      5    6    3    4    7
 6      6    5    4    3    8
 7      7    8    4    2    5
 8      8    7    3    1    6

Machine M

The partitions with substitution property and their lattice are:


PASP: PROC OPTIONS (MAIN);
DCL (X(8),NX(5,8)) FIXED DEC (1,0);
PUT LIST ('THESE ARE PARTITIONS WITH SUBSTITUTION PROPERTY
OF THE GIVEN MACHINE M');

X(1)=1;
DO X(2)=1 TO 2;
DO X(3)=1 TO 3;
DO X(4)=1 TO 4 WHILE (X(X(3))=X(3));
DO X(5)=1 TO 5 WHILE (X(X(4))=X(4));
DO X(6)=1 TO 6 WHILE (X(X(5))=X(5));
DO X(7)=1 TO 7 WHILE (X(X(6))=X(6));
GET: DO X(8)=1 TO 8 WHILE (X(X(7))=X(7));
P=X(8);
IF X(P)¬=X(8) THEN GOTO GIVUP;
/* GENERATE A NEW PARTITION REPRESENTATION X(I) WHICH SATISFIES
THE CONTRACTION AND IDEMPOTENT CRITERIA */

NX(1,1)=X(2); NX(1,2)=X(1);
NX(1,3)=X(4); NX(1,4)=X(3);
NX(1,5)=X(5); NX(1,6)=X(6);
NX(1,7)=X(7); NX(1,8)=X(8);
NX(2,1)=X(1); NX(2,2)=X(2);
NX(2,3)=X(3); NX(2,4)=X(4);
NX(2,5)=X(6); NX(2,6)=X(5);
NX(2,7)=X(8); NX(2,8)=X(7);
NX(3,1)=X(5); NX(3,2)=X(6);
NX(3,3)=X(6); NX(3,4)=X(5);
NX(3,5)=X(3); NX(3,6)=X(4);
NX(3,7)=X(4); NX(3,8)=X(3);
NX(4,1)=X(8); NX(4,2)=X(7);
NX(4,3)=X(6); NX(4,4)=X(5);
NX(4,5)=X(4); NX(4,6)=X(3);
NX(4,7)=X(2); NX(4,8)=X(1);
NX(5,1)=X(3); NX(5,2)=X(4);
NX(5,3)=X(1); NX(5,4)=X(2);
NX(5,5)=X(7); NX(5,6)=X(8);
NX(5,7)=X(5); NX(5,8)=X(6);
/* NX(I,J) IS THE NEXT STATE OF STATE J
UNDER THE INPUT I */

TEST: DO I=1 TO 7;
DO J=1 TO 8;
DO K=1 TO 5;
IF ¬(¬(X(I)=X(J))|(NX(K,I)=NX(K,J))) THEN GO TO GIVUP;
END; END;
END TEST;
/* TEST THE SUBSTITUTION PROPERTY OF THE PARTITION */

PUT SKIP (3);
DO I=1 TO 8;
PUT EDIT ('X(',I,')=',X(I))(A(2),F(1),A(2),F(1));
PUT EDIT (' ')(A(3));
END;
GIVUP: END GET;
END; END; END; END; END; END;
END PASP;

THESE ARE PARTITIONS WITH SUBSTITUTION PROPERTY OF THE GIVEN
MACHINE M

X(1)=1   X(2)=1   X(3)=1   X(4)=1   X(5)=1   X(6)=1   X(7)=1   X(8)=1

X(1)=1   X(2)=1   X(3)=1   X(4)=1   X(5)=5   X(6)=5   X(7)=5   X(8)=5

X(1)=1   X(2)=1   X(3)=3   X(4)=3   X(5)=3   X(6)=3   X(7)=1   X(8)=1

X(1)=1   X(2)=1   X(3)=3   X(4)=3   X(5)=5   X(6)=5   X(7)=7   X(8)=7

X(1)=1   X(2)=2   X(3)=2   X(4)=1   X(5)=5   X(6)=6   X(7)=6   X(8)=5

X(1)=1   X(2)=2   X(3)=3   X(4)=4   X(5)=5   X(6)=6   X(7)=7   X(8)=8
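
For readers without a PL/I compiler, the same enumeration can be sketched in Python. This is an illustrative re-expression, not the authors' program: the names `canonical_maps`, `has_sp` and `DELTA_M` are assumptions, and the nested DO loops are replaced by a recursive generator that produces every mapping satisfying the contraction and idempotency criteria. The next-state rows are copied from the state table of machine M.

```python
def canonical_maps(n):
    """Yield every f in F({1..n}); f[i-1] is the least element of i's block."""
    def extend(prefix):
        i = len(prefix) + 1
        if i > n:
            yield list(prefix)
            return
        # f(i) is either an earlier block leader (f(j) = j, j < i) or i itself
        leaders = [j for j in range(1, i) if prefix[j - 1] == j]
        for v in leaders + [i]:
            prefix.append(v)
            yield from extend(prefix)
            prefix.pop()
    yield from extend([])


def has_sp(f, delta):
    """True if f(i) = f(j) implies f(delta(i,k)) = f(delta(j,k)) for all k."""
    n = len(f)
    for row in delta:
        nx = [f[row[i] - 1] for i in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                if f[i] == f[j] and nx[i] != nx[j]:
                    return False
    return True


# Next-state table of machine M: one row per input x1..x5, states 1..8.
DELTA_M = [
    [2, 1, 4, 3, 5, 6, 7, 8],
    [1, 2, 3, 4, 6, 5, 8, 7],
    [5, 6, 6, 5, 3, 4, 4, 3],
    [8, 7, 6, 5, 4, 3, 2, 1],
    [3, 4, 1, 2, 7, 8, 5, 6],
]

sp_partitions = [f for f in canonical_maps(8) if has_sp(f, DELTA_M)]
for f in sp_partitions:
    print(f)
```

The generator visits the 4140 partitions of an 8-element set in the same lexicographic order as the nested loops, and for machine M it reports the same six representations as the listing above.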
A SEMI-AUTOMATIC DESIGN VERIFICATION SYSTEM FOR COMMUNICATIONS
LINE CONTROL PROCEDURES

E. J. Lancevich and E. J. Smith

The present report illustrates the use of a communications protocol design
language (CPDL) as a powerful tool in a semi-automatic design-verification system. The
application of CPDL to the description of line-control procedures is described in another
section of the present report.

A.  Design  Automation  of  Line  Control  Procedures 

In the following sections, some techniques will be described that may be used in
a semi-automated design environment where the final goal is the implementation of line
control procedures. We choose to use the word implementation rather than hardware,
software or firmware realization, since the state of the art is changing so rapidly in
the field. There are many considerations that must be taken into account in an overall
design automation environment, such as the ultimate output (what is to be obtained), the
cost of developing the system, etc. Our focus here is on verifying the design of line
control procedures, that is, on procedures that check for "correctness" of the line controls.

Consider the block diagram shown in Figure 1. In this diagram, the CPDL
description (program) is fed to a computer, indicated by the dotted enclosure, which
contains a syntax checker and internal description generation block. The syntax checker
is simply used to check the structural accuracy of the code, while the internal
description generation block is used to generate a set of data describing CPDL in a way which
can be used as input to the design verification block. In a broad sense, the syntax
checker and internal description block is a compiler for the CPDL language, where the
output is not executable software, but data useful in design verification. In the
following sections we develop some criteria which all correct CPDL programs must satisfy
and we give algorithms for the checking of these criteria.

[Fig. 1. Elements of a design verification system using CPDL as a source language.
Blocks shown: CPDL program; compiler (syntax checking, internal description);
design verification programs; results.]

B.  Verification  of  Line  Control  Procedures 

All line control procedures must possess certain properties in order to be capable
of execution within some implementation. Such properties are presented in the
following subsections along with the algorithms for their detection.

1.  Output  Connectivity 

The property of output connectivity is that for any state there must exist at least
one state transition for which that state is the source and in which it is not the
destination. This essentially says that, if viewed in terms of a state diagram, a state must
be linked to at least one state other than itself. This situation is illustrated in Figure
2. The condition is consistent with the operational or dynamic environment of the
system, since the state only exists as a momentary marker of the time sequence of
signals that are driving the control; hence, each state must have another state as its
destination. From these remarks it is seen that the same situation applies to all states
in the procedure; so essentially the line control procedure is a closed system. This
property is extremely easy to detect. The basic algorithm determines whether the
system possesses the property by forming a boolean connectivity matrix C, with
elements $c_{ij}$, where $c_{ij} = 1$ when element i is connected to element j and $c_{ij} = 0$ otherwise.
This is most easily achieved by assigning a number to each state in the symbol table
of the compiler in a first pass, and in the second pass, using the number assigned to
make an entry in the matrix when the state transition field of the CPDL statement is
rescanned. For example, if the state transition field of the statement contained the
entry INIT,RCV, on the first pass the compiler creates a unique number for the symbol
INIT, say 2, and for RCV, say 7, and the element $c_{2,7}$ is then set equal to one. Having
the C matrix, the following semi-formal algorithm checks for the connectivity, where
N is the number of distinct states in the line control procedure.

[Fig. 2. Graphical interpretation of line control procedure connectivity properties:
output connectivity of state i; input connectivity of state j.]


Algorithm

Input:   Connectivity matrix C, size N by N

Output:  Marked states indicating which states are Output Connected

1.  do i = 1 to N
        do j = 1 to i-1, j = i+1 to N
            if $c_{ij}$ = 1 then begin mark element "i"
                                 goto 2
                           end
        end
2.  end

3.  stop


The words do, to, if, then, begin, goto, end and stop are keywords in the semi-formal
programming language. The outer do loop controls the stepping through the rows of the
matrix, while the inner do loop marks element i if it is connected to another element,
not itself. The end closes the do statements and the begin statement. Note that
connectivity is just a necessary condition for the procedure described by CPDL. It
clearly does not guarantee or say anything about whether the line control performs correctly.
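
The marking loop is simple enough to express in a few lines of a modern language. The Python sketch below is illustrative and not part of the report; the function name `output_connected` and the three-state example matrix are assumptions.

```python
def output_connected(C):
    """Return the 1-based numbers of states with a transition to another state.

    C is an N-by-N 0/1 connectivity matrix; C[i][j] == 1 means state i+1
    has a transition to state j+1. Self-loops (j == i) are ignored.
    """
    n = len(C)
    marked = []
    for i in range(n):
        if any(C[i][j] for j in range(n) if j != i):
            marked.append(i + 1)
    return marked


# Hypothetical procedure: 1 -> 2, 2 -> 2 (self-loop only), 3 -> 1.
C = [
    [0, 1, 0],
    [0, 1, 0],
    [1, 0, 0],
]
print(output_connected(C))
```

State 2 is left unmarked because its only transition returns to itself, which is exactly the situation the property is meant to expose.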


2.  Input  Connectivity 

Just as the connection out of a state is necessary, so is input connectivity; that
is, for state i, there exists a state j different from i from which a transition must
occur. This is apparent from the preceding remarks and additionally from the fact
that any state must be attainable from the idle state defined in the program. The
condition is tested by forming the transpose of the connectivity matrix C defined in
Section 1 and then applying the same test. In this case, the states left unmarked form the
set of unreachable states. These algorithms may be programmed and used as part of
the design verification system. For designs involving many states, the ones of most
interest, errors such as the ones detected by the two described algorithms are bound
to occur, so that even though the tests appear to be rather simple, they are essential.
The tests for the connectivity properties are performed on the static description of the
line control procedure, are straightforward, and moreover, lead to the conclusion from
the switching theoretic point of view that the graph is strongly connected; that is, any
state may be reached from any other state under some sequence of inputs. While we
have no immediate use for this property, it may be of use in future investigations.
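
The transpose test can be sketched in the same spirit. The Python below is an illustration under assumed names (`input_connected`, `without_input`); it reuses the row-scanning idea on the transposed matrix and reports both the marked states and those left unmarked.

```python
def input_connected(C):
    """Return 1-based states that have an incoming transition from another state.

    The connectivity matrix is transposed so that row i lists the sources
    of transitions into state i+1; the row test then ignores self-loops.
    """
    n = len(C)
    Ct = [list(col) for col in zip(*C)]      # transpose of C
    return [i + 1 for i in range(n)
            if any(Ct[i][j] for j in range(n) if j != i)]


def without_input(C):
    """States left unmarked by the input-connectivity test."""
    marked = set(input_connected(C))
    return [s for s in range(1, len(C) + 1) if s not in marked]


# Hypothetical procedure: 1 -> 2, 2 -> 2 (self-loop only), 3 -> 1.
C = [
    [0, 1, 0],
    [0, 1, 0],
    [1, 0, 0],
]
print(input_connected(C), without_input(C))
```

In this example no other state leads into state 3, so it is the one left unmarked.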

3.  Completeness 

The  property  of  completeness  is  another  property  that  all  line  control  procedures 
must  possess.  Basically,  we  require  that  all  signals  in  the  same  set  be  accounted  for 
when  any  signal  from  that  set  causes  a transition  out  of  a state.  For  example,  if  the 
control is in some state "A" and a transition is caused by the arrival of the character
"SOH," where "SOH" is an element of a signal set with four elements, the three other
elements of the set must appear on some transition out of state "A." The reason behind
this is that we tacitly assume that an error may occur on the transmission line
which  transforms  one  element  of  the  set  into  another  and  goes  undetected  in  the  signal 
processor.  In  a deterministic  design,  there  must  be  a defined  action  for  these  errors 
and  thus  checking  for  the  occurrence  of  this  phenomenon  is  a valid  design  verification 
function.  In  the  case  of  feedback  signals  from  the  multiplexer,  we  assume  reliable 
signalling;  thus  for  these  signals,  no  such  checking  is  necessary.  In  checking  for  this 
property,  the  algorithm  is  more  suitably  placed  within  the  context  of  the  block  labelled 
compiler,  (the  dotted  enclosure  in  Figure  1).  The  general  idea  of  the  algorithm  is  to 
evaluate  the  signal  expressions  as  they  are  parsed  (by  the  syntax  checker)  for  each 
current state of the control. This evaluation results in a set of signals. A vector for
each signal set defined within the line control procedure, i.e., appearing in the definitions
section of the CPDL program, is created and initially set to zero. At each statement
associated with the same state in the program, the result of the signal expression
evaluation  is  used  to  mark  the  signals  that  appeared  for  the  particular  statement.  After 
passing  over  the  entire  input  program  for  the  particular  state,  the  vectors  are  checked. 
If  the  signal  set  vector  has  all  ones  or  all  zeroes,  the  signal  set  has  been  accounted 
for (all ones) or was not present in the transition (all zeroes). If neither of these
conditions is met, then in those vectors that contain the combinations of zeroes and ones,
the  signals  corresponding  to  the  zeroes  have  not  been  accounted  for.  The  checking 
procedure  is  repeated  for  all  states. 

The  algorithm  is  illustrated  by  an  example,  instead  of  a more  formal  step-by- 
step  procedure. 

The state transition graph for the example is shown in Fig. 3, along with the
corresponding CPDL program. In the interest of clarity, no processing functions are
included. The only signal set defined is the set of integers 0, 1, 2, 3. A four-element
vector is set initially to zero. When the program implementing the check algorithm
reads in the first line of the control section, an association is made between the vector
and the state appearing on the left-hand side of the state transition field. In the first
line (or statement), the signal expression evaluates to the set 0, 1, 2, so those elements
of the vector corresponding to those signals are set to one; for example, the vector
might contain, after the operation has been performed, 1,1,1,0. The program reads
the next input statement. Since the state is A1, the signal expression is evaluated and
the new signal, 3, is noted by entering a one in the last vector element. In each of the
steps, the input lines of the program are marked indicating that they have been operated
upon. When all lines have been marked, the check algorithm terminates.

CPDL PROGRAM

DEFINITIONS
SIGNAL (0,1,2,3)
START - A1
CONTROL
(0|1|2)/ A1,A2/..
3/ A1,A3/..
2/ A2,A3/..
(0|1)/ A2,A1/..
(2|3)/ A3,A1/..
(0|1)/ A3,A2/..

Fig. 3. Example for completeness check algorithm.

Since the next four statements do not contain the state A1 in the left-hand side of
a state transition, they are skipped over by the check program. At the conclusion of
the first pass over the input program, all signals are accounted for at A1.

A second  iteration  of  the  procedure  is  then  started.  During  the  second  pass,  the 
signal  set  vector  is  set  to  zero  at  the  outset,  and  the  check  program  starts  to  read  the 
CPDL  program  again.  Since  the  first  two  statements  are  marked,  the  signal  set  vector 
is  now  associated  with  state  A2.  In  a manner  similar  to  that  described  above,  the  lines 
associated  with  A2  are  processed,  marked,  and  at  the  conclusion  of  the  pass,  the 
vector  contains  1,  1,  1,0  indicating  that  at  A2,  a transition  has  not  been  provided  for  the 
signal  3.  The  algorithm  terminates  after  processing  state  A3. 

It should further be noted that this algorithm may be used to determine whether
the  program  contains  two  or  more  transitions  from  the  state  caused  by  the  same  input 
signal,  by  checking  at  the  step  where  the  vector  is  set.  If  the  bit  associated  with  the 
signal  is  already  a one,  the  signal  has  already  defined  a transition. 
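
The vector-marking procedure walked through above can be sketched compactly. The Python below is an illustration, not the authors' checker: the name `check_completeness`, the tuple layout `(signals, source, destination)` and the use of dictionaries in place of bit vectors are all assumptions. The statements are those of the Fig. 3 example.

```python
def check_completeness(signal_set, statements):
    """Report unaccounted-for and duplicated signals per source state.

    statements is a list of (signals, source, destination) tuples.
    For each source state a 0/1 vector over signal_set is filled in;
    a vector mixing zeroes and ones exposes the missing signals.
    """
    states = []
    for _signals, src, _dst in statements:
        if src not in states:
            states.append(src)

    missing, duplicated = {}, {}
    for state in states:                      # one pass per state
        seen = {s: 0 for s in signal_set}     # vector reset to zero
        for signals, src, _dst in statements:
            if src != state:
                continue
            for s in signals:
                if seen[s]:                   # bit already one: second transition
                    duplicated.setdefault(state, []).append(s)
                seen[s] = 1
        if any(seen.values()) and not all(seen.values()):
            missing[state] = [s for s in signal_set if not seen[s]]
    return missing, duplicated


statements = [
    ((0, 1, 2), "A1", "A2"),
    ((3,),      "A1", "A3"),
    ((2,),      "A2", "A3"),
    ((0, 1),    "A2", "A1"),
    ((2, 3),    "A3", "A1"),
    ((0, 1),    "A3", "A2"),
]
missing, duplicated = check_completeness((0, 1, 2, 3), statements)
print(missing, duplicated)
```

For the example, the only finding is that state A2 has no transition for signal 3, in agreement with the walkthrough.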
C.  Discussion


The work reported demonstrates the feasibility of a useful design automation
technique which uses CPDL as an input language. The constraints imposed by the
communications environment on the line control procedure have led to the inference of a
basic set of properties which each line control procedure must satisfy and to the
effective procedures described to check for them using the CPDL program as input to
the algorithms.
CPDL - A COMMUNICATIONS PROTOCOL DESIGN LANGUAGE; APPLICATION TO
FULL-DUPLEX PROTOCOLS

E. J. Lancevich and E. J. Smith

A communications protocol design language (CPDL) was previously developed for
the precise definition of simplex-type communication protocols (Refs. 1, 2). Additional work
summarized in the present report has shown that the language can be effectively extended
to the more complex case of full-duplex protocols as well. The extended BNF description
of CPDL is given. An included example illustrates both the CPDL description of a
full-duplex protocol and its corresponding state-diagram representation.

A.  Full-Duplex  Protocols 

The primary distinguishing factor between simplex, half-duplex, and full-duplex
line controls is the parallelism that exists in the latter case, since transmission takes
place in both directions on the line at the same time. In most cases of interest,
however, full-duplex protocols rely on sequence numbering in each direction to maintain
synchronism between sending and receiving processes at the switch. Sequence numbers
are usually generated by a modulo k numbering scheme, in which k is the largest
number of messages (assuming equal message lengths) that may be propagating in both
directions on the communications line. As an example, consider a scheme in which
blocks 1000 bits long are transmitted over a channel with capacity resulting in a
one-way delay of 1 second. Then the equivalent length of the transmission medium (round trip)
is 4800 bits. Thus five messages (total) may reside in the channel at one time, so that
the acknowledgement signal for the first message sent by one end arrives at the other
end at the same time the fifth message is being transmitted, in the worst case; so that
for this example k = 5. Gray (Ref. 3) presents an excellent survey of full-duplex protocols and
gives strong arguments favoring the use of independent sequence numbering in each
direction.
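
The arithmetic behind k = 5 can be made explicit. In the sketch below, the channel rate of 2400 bits per second is an assumption inferred from the stated 4800-bit round trip and 1-second one-way delay; the report does not state the rate. The function name `window_size` is likewise hypothetical.

```python
import math


def window_size(block_bits, one_way_delay_s, rate_bps):
    """Messages that fit in the round-trip length of the channel.

    round-trip length (bits) = 2 * one-way delay * channel rate;
    k is that length divided by the block length, rounded up.
    """
    round_trip_bits = 2 * one_way_delay_s * rate_bps
    return math.ceil(round_trip_bits / block_bits)


# 1000-bit blocks, 1 s one-way delay, assumed 2400 bit/s channel:
# round trip = 4800 bits, so k = ceil(4.8).
k = window_size(1000, 1, 2400)
print(k)
```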

We  apply  our  proposed  model  to  just  such  a case.  For  simplicity,  assume  that 
messages  have  the  format  shown  in  Figure  1.  The  character  SOM  refers  to  the  start 
of  the  message.  This  is  followed  by  a fixed-length  indicator  transmit  sequence  number, 
a fixed-length  information  block  and  a fixed  length  receive  number.  Again  to  simplify, 
assume there are no checking characters. Assume also that the length of each sequence
number is 3 and the length of the information field is 50.

Both ends transmit and receive simultaneously with two independent sequence
numbers; when no errors occur, the transmit sequence number received at one end of
the connection will be placed into the received sequence number at the same connection
and then transmitted back to the original sender (as a positive acknowledgement).
However, if the original message is garbled, the receiver transmits the last properly
received sequence number and destroys the message. This effectively is a negative
acknowledgement signal and causes the original message plus the next set of messages
(until the time the negative acknowledgement of the original message is received) to be
misinterpreted.

[Fig. 1. Message format for the full-duplex example: SOM; transmit sequence no.;
fixed-length data; receive sequence no.]

To illustrate, suppose k = 5, and A sends to B. Let the messages be labelled $M_0$,
$M_1$, $M_2$, $M_3$, $M_4$. If $M_2$ is received in error by B, then in two consecutive
transmissions, A receives ones in the corresponding number field. If A has already transmitted
$M_3$ and $M_4$, they will be received in error at B (out of sequence) and discarded from
the system. Thus in this scheme, $M_2$, $M_3$ and $M_4$ must be retransmitted in order to
maintain channel synchronism.
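
A toy model makes the retransmission rule concrete. The Python below is a sketch under simplifying assumptions and is not the CPDL receiver: the name `receive`, the `(sequence number, received intact)` tuples and the immediate per-message acknowledgement are all illustrative choices.

```python
def receive(frames, k=5):
    """Model the receiver's sequence check for a modulo-k scheme.

    frames is a list of (seq, ok) pairs in arrival order. A frame is
    accepted only if it is intact and carries the expected number;
    every reply carries the last properly received sequence number.
    """
    expected = 0
    last_valid = None
    accepted, acks = [], []
    for seq, ok in frames:
        if ok and seq == expected:
            accepted.append(seq)
            last_valid = seq
            expected = (expected + 1) % k
        acks.append(last_valid)              # positive or negative acknowledgement
    return accepted, acks


# M2 is garbled; M3 and M4 arrive intact but out of sequence.
frames = [(0, True), (1, True), (2, False), (3, True), (4, True)]
accepted, acks = receive(frames)
retransmit = [seq for seq, _ in frames if seq not in accepted]
print(accepted, acks, retransmit)
```

Once the message numbered 2 is lost, every later reply repeats sequence number 1, and the sender must go back and resend messages 2, 3 and 4.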

The  finite-state  graph  of  the  receiver  describing  this  scheme  is  shown  in  Fig.  2 
for  one  end  of  the  connection  and  the  sender  transition  graph  is  shown  in  Figure  3. 

The receiver responds to a start-of-message signal initially and proceeds to collect
the first three digits. If the transmitted sequence number is equal to the expected
sequence number then the expected number is increased by one mod 5 and the signal vtc
is sent via the cause primitive of CPDL (see Figure 5).

If  the  number  transmitted  is  not  equal  to  the  proper  number,  the  signal  invtc  causes 
a transition  to  the  error  state  (ERROR)  and  the  rest  of  the  transmission  is  ignored.  In 
this  diagram,  we  assume  that  the  signals  invtc  and  vtc  occur  before  the  next  significant 
transmission. 

From the error state, once the last transmission is received, operations are executed within the multiplexer which cause the next outbound transmission to carry, in its receive field, the sequence number of the last valid block received. The transmission of the block is executed by the sender state diagram, prompted by the ready signal generated by the cause primitive at the receiver.
[Fig. 2. Receiver state graph for the full-duplex example. Transition labels legible in the figure:
  SOM
  digit(COMM) & CT>1 / store(BF); decr(CT,1)
  digit(COMM) & CT=1 / store(BF); if the TSN is the expected TSN, cause(vtc); otherwise cause(itc) and execute operations in the multiplexer to send the next transmission. The next transmission will contain the last valid received sequence no.; cause(ready). The ready signal is sent to the sender control (see Figure 3).
  vtc / set(CT,54)
  any(COMM) & CT>1 / store(BF); decr(CT,1)
  any(COMM) & CT=1 / refer to text for these operations]


[Fig. 3. Sender state graph for the full-duplex example. Transition labels legible in the figure: ready / transmit(BOUT); ready / cause(ready).]

If the transmission sequence number is invalid, the multiplexer processing proceeds to load the last valid received sequence number into the next outbound message and prompts the sending procedure for the line with a ready signal. If the sending process is idle, it transmits the next outbound message. Otherwise (it is already transmitting another message), it prompts the multiplexer to reenable it to send again, in effect a waiting loop for the transmission. This is depicted in Fig. 3, the sender state diagram. When the signal eos appears from the signal processor, the sender is again enabled to retransmit.

Finally, if the transmission sequence number is valid, the receiver collects the rest of the transmitted characters and then enables execution of the following operations in the multiplexer.

If the received sequence number is equal to the expected sequence number, then that message slot in the list of messages is released and the expected sequence number is incremented by one. (It was received correctly, so the copy is destroyed.) If the received sequence number is not equal to the expected number, then the next message to be transmitted is the one with the expected number, provided this is the first time that the same "last valid received" number matches the memory. Otherwise it is discarded.

To further clarify this last statement, suppose that messages M0, M1, M2, M3, M4 are sent from A and M2 is received in error by B, and the next message to be transmitted is M0, which implies M3 and M4 have already been sent. From B the next three return messages will have received sequence numbers all equal to 1. (M2 garbled, M3 and M4 rejected because out of sequence.) The first one received should cause M2 to be transmitted next and the next two should be ignored (up to a maximum of 5 and then the process should be restarted), since these are the result of M3 and M4 being ignored at the receiver due to the garbling of M2. This completes the example, which indicates that the model is generally applicable to FDX protocols which use double sequence numbering for synchronization. The (uncommented) CPDL descriptions are shown in Figure 4.
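The example can be traced with a short simulation. The following Python sketch is only an illustration and is not part of the CPDL description: the function names, the modulus constant, and the way the "memory" of the last returned number is kept are assumptions made here, and the messages are taken to be numbered 0 through 4.

```python
MODULUS = 5  # k = 5 in the example; sequence numbers run 0..4 (assumed numbering)


def run_receiver(frames):
    """Simulate the receiving end for a list of (tsn, received_ok) pairs.

    Returns the accepted sequence numbers and, for every frame, the
    "last valid received" number placed in the next outbound message.
    """
    expected = 0
    last_valid = None
    accepted, returned = [], []
    for tsn, received_ok in frames:
        if received_ok and tsn == expected:
            accepted.append(tsn)
            last_valid = tsn
            expected = (expected + 1) % MODULUS
        # A garbled or out-of-sequence frame is discarded; the returned
        # number is unchanged and so acts as a negative acknowledgement.
        returned.append(last_valid)
    return accepted, returned


def sender_reactions(returned_numbers):
    """Interpret the returned numbers at the sending end (assumed bookkeeping)."""
    expected = 0
    remembered = None
    actions = []
    for number in returned_numbers:
        if number == expected:
            actions.append(f"release M{number}")
            expected = (expected + 1) % MODULUS
            remembered = None
        elif number != remembered:
            # First mismatch with this value: go back to the expected message.
            remembered = number
            actions.append(f"retransmit from M{expected}")
        else:
            actions.append("ignore")
    return actions


# M2 is garbled; M3 and M4 arrive intact but out of sequence.
frames = [(0, True), (1, True), (2, False), (3, True), (4, True)]
accepted, returned = run_receiver(frames)
actions = sender_reactions(returned)
```

Under these assumptions the three return messages following M1 all carry the number 1; the first triggers retransmission starting at M2 and the other two are ignored.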

B.  Comments  on  the  Example 

It should be noted that the checking of sequence numbering, which apparently is part of the line protocol, is accomplished as a multiplexer function. We contend that the sequence numbering is not really part of the line protocol, but a device for maintaining local synchronization between the sending procedure and the receiving procedure. The sequence numbering merely serves as a device to link the two procedures.

C.  CPDL-BNF  Description 

The  latest  version  of  CPDL  is  defined  by  the  BNF  description  shown  in  Figure  5. 
As  more  experience  is  gained  in  its  use,  it  will  be  modified  and  enhanced. 
RECEIVER PROCEDURE

definitions
    signal COMM (SOM, A, B, ..., Z, 0, 1, ..., 9)
    counter CT
    buffer BF
    start = IDLER

control
    SOM/IDLER, IDLER/nop;
    SOM/IDLER, COL3/set(CT,3); store(BF);
    digit(COMM)&CT>1/COL3, COL3/store(BF); decr(CT,1);
    digit(COMM)&CT=1/COL3, CHTSN/store(BF);
        [For other operations refer to Fig. 2 and Section A of text.]
    vtc/CHTSN, COL53/set(CT,54);
    any(COMM)&CT>1/COL53, COL53/store(BF); decr(CT,1);
    any(COMM)&CT=1/COL53, IDLER/store(BF);
        [Refer to Section A.]
    invtc/CHTSN, ERROR/set(CT,54);
    any(COMM)&CT>1/ERROR, ERROR/nop;
    any(COMM)&CT=1/ERROR, ERROR/nop;

SENDER PROCEDURE

definitions
    signal COM (ready, eos)
    buffer BOUT
    start = IDLE

    ready/IDLE, XMIT/transmit(BOUT);
    ready/XMIT, XMIT/cause(ready);
    eos/XMIT, IDLE/nop;

Fig. 4. CPDL programs for sender and receiver for full-duplex example.




<PROGRAM>       -> definitions <DLIST> control <SLIST>
<SLIST>         -> <SLIST><STMNT> | <STMNT>
<STMNT>         -> <EEXPR> / <FST> , <TST> / <EXELT>
<FST>           -> <IDENTIFIER>
<TST>           -> <IDENTIFIER><IDLST>
<IDLST>         -> (<IDENTIFIER> {, <IDENTIFIER>})
<EXELT>         -> <FUNC> ; <EXELT> | <FUNC> ;
<DLIST>         -> <DLIST><DEFINITION> | <DEFINITION>
<DEFINITION>    -> counter <IDLST> | buffer <IDLST>
<EEXPR>         -> <SETEXPR><CNTEXPR>
<CNTEXPR>       -> <CNT><RLOP><INTEGER> {& <CNTEXPR>}
<CNT>           -> <IDENTIFIER>
<INTEGER>       -> <INTEGER><DIGIT> | <DIGIT>
<RLOP>          -> < | ≤ | = | ≥ | > | ≠
<SETNAME>       -> <IDENTIFIER> | ascii | ebcdic | baudot | transcode | 4of8
<IDENTIFIER>    -> <LETTER><ALPHAMERIC><IDENTIFIER>
<ALPHAMERIC>    -> <LETTER> | <DIGIT>
<LETTER>        -> a|b|c|d|e| ... |y|z|A|B|C| ... |Y|Z
<DIGIT>         -> 0|1|2|3|4|5|6|7|8|9
<SETEXPRESSION> -> <SETTRM> <SETEXPRESSION> | <SETTRM>
<SETTRM>        -> <SETFCT> & <SETTRM> | <SETFCT>
<SETFCT>        -> '(' <SETEXPRESSION> ')' | <ELEMENTEXP> | <UNARYOP> <SETEXPRESSION>
<UNARYOP>       -> any | not | alpha | digit
<ELEMENTEXP>    -> <ELN> [operator illegible] <ELEMENTEXP> | <ELN>
<ELN>           -> <IDENTIFIER>
<DEFINITION>    -> signal <IDLIST>

Fig. 5. BNF definition of CPDL.
ON A TECHNIQUE FOR DYNAMIC ROUTING
R.R. Boorstyn and A. Livne

Many computer communication networks use dynamic routing schemes to compensate for input traffic variations, to respond to changes in topology, and to take advantage of temporary changes in loading on different paths. These adaptive schemes are complex and are usually chosen and verified by extensive simulations. Invariably, during the design of a network, they are replaced by analytically tractable non-dynamic (static) schemes. We present here a dynamic routing scheme for which we have been able to derive approximate analytical models [1,2]. Furthermore, we can establish the efficiency of this scheme, especially in heavily loaded situations.

A typical  static  routing  scheme  would  operate  as  follows.  Consider  as  separate 
commodities  the  messages  originating  at  a particular  node  and  destined  for  a second 
node  in  the  network.  There  are  usually  several  good  paths  connecting  these  nodes. 

The  static  routing  scheme  would  specify  the  optimum  proportion  of  traffic  to  be  routed 
over  each  path.  Efficient  algorithms  exist  for  design  of  this  type  of  routing. 

We  can  identify  one  particular  problem  with  this  approach.  Although  good  paths 
are  indeed  found,  any  node  essentially  operates  as  a collection  of  single  server  queues 
--  one  queue  for  each  outgoing  branch.  Considering  the  node  as  a queue  with  several 
potential  servers  this  is  not  an  efficient  manner  of  operation.  Indeed  if  a node  had  k 
outgoing  branches  and  was  operated  as  a queue  with  k servers  then  when  heavily  loaded 
the  time  delay  would  be  reduced  by  a factor  of  k.  Conversely  the  throughput  can  be  in- 
creased. 
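The factor-of-k claim can be checked numerically under a specific traffic model. The sketch below assumes Poisson arrivals and exponential service times (an M/M/1 queue per branch versus one M/M/k queue for the whole node); the text does not state a traffic model, so these assumptions and all function names are illustrative only.

```python
from math import factorial


def erlang_c(k, offered_load):
    """Probability that an arrival must wait in an M/M/k queue (Erlang C)."""
    rho = offered_load / k
    tail = offered_load ** k / factorial(k) / (1.0 - rho)
    head = sum(offered_load ** n / factorial(n) for n in range(k))
    return tail / (head + tail)


def delay_separate_queues(lam, mu, k):
    """Mean time in system when traffic is split evenly over k M/M/1 queues."""
    return 1.0 / (mu - lam / k)


def delay_shared_queue(lam, mu, k):
    """Mean time in system for one queue feeding k servers (M/M/k)."""
    wait = erlang_c(k, lam / mu) / (k * mu - lam)
    return wait + 1.0 / mu


def delay_ratio(utilization, k, mu=1.0):
    """Ratio of the separate-queue delay to the shared-queue delay."""
    lam = utilization * k * mu
    return delay_separate_queues(lam, mu, k) / delay_shared_queue(lam, mu, k)


heavy = delay_ratio(0.999, 4)  # approaches k = 4 as utilization approaches 1
light = delay_ratio(0.01, 4)   # close to 1: little benefit when lightly loaded
```

In this model the ratio tends to k only as the utilization approaches one, which is consistent with the qualification "when heavily loaded" in the text.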

If the node were operated as suggested above, messages would wander aimlessly through the network and the total performance would be abysmal. Our approach is to retain the good paths for commodities and yet still get the benefit of the faster performance at the node.

Briefly  our  scheme  is  as  follows.  Consider  a node  as  a single  queue  with  several 
servers (output channels). For a particular commodity, i.e., a message with a certain
destination,  the  use  of  some  of  these  servers  would  cause  the  messages  to  be  sent  along 
"bad"  paths  --  either  too  long  or  too  heavily  loaded.  Thus  for  each  commodity  we 
specify  a subset  of  the  output  channels  as  allowable  and  permit  the  message  to  use  any 
allowable  channel  according  to  some  discipline.  Each  commodity  appearing  at  the  node 
has  its  own  allowable  set  of  channels.  These  restrictions  force  messages  to  "good" 
paths  and  constitute  our  routing  strategy. 
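A minimal sketch of the restricted-choice rule follows. The class name, the per-channel backlog counters, and the tie-breaking discipline (join the least-loaded allowable channel, lowest index first) are assumptions made for illustration; the text leaves the discipline unspecified.

```python
class RoutingNode:
    """A node whose output channels act as servers of a shared queue."""

    def __init__(self, num_channels, allowable):
        # allowable maps a destination (commodity) to the set of channel
        # indices that lead along "good" paths for that destination.
        self.backlog = [0] * num_channels
        self.allowable = allowable

    def dispatch(self, destination):
        """Send one message on the least-loaded allowable channel."""
        candidates = sorted(self.allowable[destination])
        channel = min(candidates, key=lambda c: self.backlog[c])
        self.backlog[channel] += 1
        return channel


node = RoutingNode(3, {"X": {0, 1}, "Y": {2}})
choices = [node.dispatch(d) for d in ["X", "X", "X", "Y"]]
```

Here commodity X may use channels 0 and 1, so its messages alternate between them, while commodity Y has no choice and always uses channel 2.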

We have found that the node retains its performance advantage as a multiple-server queue as long as a modest amount of the traffic has a choice of channels. We have developed approximate, yet fairly accurate, analytical models to calculate the performance of these nodes.

We have also developed analytical methods to imbed these nodes in a network and to calculate the overall performance of the network. Basically we find that if the average number of output channels per node is k, then the time delay for messages in a heavily loaded network can be reduced by as much as a factor of k when this dynamic routing is used.
CHANNEL ASSIGNMENTS FOR CELLULAR MOBILE TELECOMMUNICATIONS SYSTEMS

R.R.  Boorstyn  and  R.J.  Pennotti 

A cellular  mobile  telecommunications  system^  uses  base  stations  distributed 
throughout  a metropolitan  area  to  link  mobile  customers  to  the  land  telephone  network. 
Each  base  is  equipped  with  enough  radios  to  serve  customers  within  the  area  (cell)  for 
which  it  is  responsible.  A radio  channel  must  be  assigned  to  each  radio  in  the  system; 
the  same  channel  may  be  assigned  to  more  than  one  radio  as  long  as  the  radios  using 
any  one  channel  are  dispersed  sufficiently  that  a mobile  using  the  channel  in  one  cell 
is  not  unduly  interfered  with  by  all  other  mobiles  using  the  same  channel.  This  con- 
straint can  be  formalized  as  a minimum  allowable  co-channel  reuse  distance  in  units 
equal  to  the  cell  radius,  since  the  required  co-channel  separation  in  a system  is  direct- 
ly proportional  to  the  radius  of  the  cells. 

For  our  purpose,  a cell  layout  will  be  defined  as  the  specification  of  cell  bound- 
aries throughout  a system  and  the  assignment  of  weights  (equal  to  the  number  of  radios 
required)  to  the  resulting  cells. 

The Channel Assignment Problem (CAP) is: "Given a cell layout and required co-channel separation distance, how can channels be assigned to cells so as to minimize the total number of channels used?"

A.  The  Model 

Radio  interference  constraints  can  be  represented  by  a graph,  G(V,E),  with  a 
vertex  for  each  radio  and  an  edge  between  any  two  vertices  corresponding  to  radios 
which  cannot  use  the  same  channel.  We  will  call  this  graph  the  system  micrograph  for 
reasons  which  will  become  clear  shortly.  The  channel  assignment  problem  reduces  on 
the  system  micrograph  to  a well-known  graph-theoretic  problem:  that  of  finding  the 
chromatic  number  of  the  graph.  This  problem  has  been  studied  extensively,  and  a large 
number  of  bounds  and  coloring  algorithms  have  been  developed  in  terms  of  the  connective 
properties  of  the  graph.  Unfortunately,  the  computational  effort  necessary  to  apply 
these  results  grows  very  quickly  with  the  number  of  vertices  in  the  graph  to  which  they 
are  applied.  A graph  with  100  vertices  is  extremely  large  for  consideration  of  either 
the  chromatic  number  itself  or  reasonably  tight  bounds  on  the  chromatic  number.  We, 
on  the  other  hand,  are  interested  in  cellular  systems  which  have  thousands  of  radios, 
and  so  find  ourselves  in  the  familiar  position  of  having  at  our  disposal  theoretical  re- 
sults which  are  not  directly  applicable  to  our  problem.  In  order  to  overcome  these 
computational  difficulties,  we  have  developed  another  graph-theoretic  model  which  takes 
advantage  of  the  cellular  structure  of  the  systems  in  which  we  are  interested.  This 
model we call the system macrograph, $M(\Omega, \Gamma, W)$; it is a graph on $(\Omega, \Gamma)$ with weighted vertices. (See Figure 1.) W is the set of vertex weights. The vertex set, $\Omega(M)$, is the set of cells in the system. Given $\omega_i, \omega_j \in \Omega$, we include $(\omega_i, \omega_j)$ as an edge (i.e., a member of $\Gamma$) if and only if cell i interferes with cell j. The weight $w_i$ assigned to vertex $\omega_i$ is the number of channels required in cell i.


[Fig. 1. A macrograph, $M(\Omega, \Gamma, W)$, with $\Omega(M) = \{\omega_1, \omega_2, \omega_3, \omega_4, \omega_5\}$, $\Gamma(M) = \{a, b, c, d, e, f\}$, and $W = \{w_1, w_2, w_3, w_4, w_5\}$.]

The  system  macrograph  is  a much  more  compact  representation  of  a cellular 
system  than  is  the  micrograph.  It  replaces  all  the  vertices  representing  radios  in  the 
same  cell  with  one  weighted  macronode,  taking  advantage  of  the  fact  that  the  interference 
constraints  on  all  radios  in  the  same  cell  are  identical. 

The correspondence between G and M is evident. G contains a $w_i$-clique for each vertex $\omega_i \in \Omega(M)$ (see Figure 2). An edge $(\omega_i, \omega_j) \in \Gamma(M)$ represents $w_i w_j$ edges in G, fully connecting the $w_i$-clique from macronode $\omega_i$ to the $w_j$-clique from macronode $\omega_j$. The weight of a node in M is thus the number of nodes in G which it represents.
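The correspondence can be made concrete by expanding a small macrograph into its micrograph. The sketch below is illustrative only; the function name and the representation of a radio as a (cell, index) pair are assumptions.

```python
from itertools import combinations


def expand_to_micrograph(weights, macro_edges):
    """Build the micrograph (vertices, edges) implied by a weighted macrograph."""
    radios = {cell: [(cell, r) for r in range(w)] for cell, w in weights.items()}
    vertices = [v for cell in radios for v in radios[cell]]
    edges = set()
    # Radios in the same cell cannot share a channel: each cell is a clique.
    for cell in radios:
        for u, v in combinations(radios[cell], 2):
            edges.add(frozenset((u, v)))
    # A macrograph edge fully connects the two corresponding cliques.
    for a, b in macro_edges:
        for u in radios[a]:
            for v in radios[b]:
                edges.add(frozenset((u, v)))
    return vertices, edges


weights = {"c1": 2, "c2": 3, "c3": 1}
vertices, edges = expand_to_micrograph(weights, [("c1", "c2")])
```

With weights 2, 3 and 1 and a single macrograph edge between the first two cells, the micrograph has 6 vertices and 1 + 3 + 2·3 = 10 edges.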

On the macrograph, the channel assignment problem becomes the problem of assigning $w_i$ distinct positive integers (or colors or channels) to each vertex (cell) $\omega_i$ of M so that the same integer is not assigned to any two adjacent vertices of M and the total number of assigned integers is minimum. We can use the relationship between macrographs and micrographs to apply known results about the chromatic numbers of graphs to the channel assignment problem in a computationally tractable way. We will illustrate this technique with two well-known bounds on chromatic number in the next section.

In  subsequent  sections  we  will  present  a method  of  partitioning  a macrograph  into 
simpler  macrographs  to  compute  an  upper  bound  on  the  optimum  CAP  solution,  and 
develop  a reduction  theorem  which  allows  us  to  eliminate  from  consideration  many  of 
the  cells  in  a typical  channel  assignment  problem. 


[Fig. 2. A macrograph and its corresponding micrograph.]

B.  The  Method 

Before  developing  bounds  on  the  minimum  number  of  channels  necessary  for  the 
CAP,  we  must  further  specify  the  relationship  between  macrographs  and  micrographs. 

The  first  theorem  along  these  lines  relates  the  cliques  of  the  micrograph  to  the 
cliques  of  the  macrograph.  We  will  see  that  there  is  a simple  one-to-one  correspond- 
ence between  the  two. 

Theorem  1:  A set  of  macronodes  is  a clique  of  the  macrograph  if  and  only  if  the 
micronodes  contained  in  them  form  a clique  of  the  micrograph.  The  correspondence 
between  micrograph  and  macrograph  cliques  is  therefore  one-to-one. 

The  significance  of  Theorem  1 becomes  evident  immediately.  If  we  need  to  enu- 
merate the  cliques  of  the  micrograph  we  need  only  enumerate  those  of  the  macrograph. 
The  size  of  a micrograph  clique  is  just  the  sum  of  the  weights  of  the  nodes  in  the  cor- 
responding macrograph  clique.  When  we  speak  of  the  largest  micrograph  clique,  we 
are  speaking  of  the  densest  (highest  weight)  macrograph  clique. 

A similar  theorem  can  be  posed  relating  macrograph  and  micrograph  maximal 
independent  sets,  the  major  difference  being  that  the  correspondence  is  no  longer  one- 
to-one.  Any  set  consisting  of  a microvertex  from  each  macrovertex  of  a macrograph 
MIS  is  a micrograph  MIS.  Thus  a macrograph  MIS  corresponds  to  many  equivalent 
micrograph  MIS's. 


*We  will  often  speak  of  the  micronodes  contained  in  a macronode.  By  this  we 
will  mean  those  micronodes  representing  radios  in  the  cell  represented  by  the  macronode. 
Theorem 2: A set of microvertices is a maximal independent set (MIS) if and only if

(1) no two of the microvertices are contained in the same macronode, and

(2) the macronodes which contain the microvertices form a MIS of the macrograph.

Note that to form a micrograph MIS, V, from a macrograph MIS, $\Omega'$, we can choose for V any set of vertices with the property that one is selected from each node of $\Omega'$. If the nodes in $\Omega' = \{\omega_1, \omega_2, \ldots, \omega_m\}$ have weights $W = \{w_1, w_2, \ldots, w_m\}$, then the MIS $\Omega'$ corresponds to $w_1 \times w_2 \times \cdots \times w_m$ MIS's of the micrograph, representing all combinations of one microvertex from each macronode of $\Omega'$.
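The counting argument can be illustrated by enumerating the micrograph sets generated by one macrograph MIS. The function name and the (cell, index) labelling of microvertices are assumptions of this sketch.

```python
from itertools import product
from math import prod


def micrograph_mis_family(mis_cells, weights):
    """All vertex sets formed by picking one microvertex per macronode."""
    choices = [[(cell, r) for r in range(weights[cell])] for cell in mis_cells]
    return [frozenset(combo) for combo in product(*choices)]


weights = {"a": 2, "b": 3, "c": 4}
family = micrograph_mis_family(["a", "c"], weights)
expected_count = prod(weights[cell] for cell in ["a", "c"])
```

For a macrograph MIS made of cells with weights 2 and 4, the enumeration yields 2 × 4 = 8 distinct micrograph sets, each containing exactly one microvertex per cell.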

We  now  discuss  two  well-known  bounds  on  the  chromatic  number  of  a graph  and 
their  relationship  to  the  CAP.  The  thrust  of  our  effort  is  to  bound  the  chromatic  num- 
ber considering  only  macrographic  properties. 

1.  Upper  Bound  (The  Maximum  Degree  Bound) 

The chromatic number, $\chi(G)$, of a graph G is bounded above by the maximum vertex-degree of the graph plus one. However, the degree of any microvertex contained in a particular macronode is just one less than the sum of the weights of the containing macronode and its neighbors (one less because the vertex is not adjacent to itself). Therefore,

$$\chi(G) \le \max_{\omega_i \in \Omega(M)} \Big( w_i + \sum_{\omega_j \in \Gamma(\omega_i)} w_j \Big),$$

where $\Gamma(\omega_i)$ denotes the set of macronodes adjacent to $\omega_i$.

Given  the  macronode  weights  and  adjacencies,  we  can  compute  this  bound  by 
computing  one  sum  for  each  cell  of  the  system. 
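A minimal sketch of this computation, one sum per cell, is given below. The data layout (a weight dictionary and a neighbor dictionary) and the sample values are assumptions for illustration.

```python
def max_degree_bound(weights, neighbors):
    """Upper bound on the number of channels: max over cells of the cell's
    own weight plus the weights of all adjacent cells."""
    return max(
        weights[cell] + sum(weights[n] for n in neighbors[cell])
        for cell in weights
    )


# Hypothetical layout: cells 1, 2, 3 mutually interfere; cell 4 interferes
# with cells 1 and 2 only.
weights = {1: 7, 2: 8, 3: 10, 4: 2}
neighbors = {1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2}, 4: {1, 2}}
upper = max_degree_bound(weights, neighbors)
```

For this layout the per-cell sums are 27, 27, 25 and 17, so the bound is 27 channels.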

2.  Lower  Bound  (The  Largest  Clique  Bound) 

Another well-known graph-theoretic bound on chromatic number is the lower bound given by the size of the largest clique of the graph. As mentioned in the discussion of Theorem 1, the largest clique of the micrograph corresponds to the densest (highest-weight) clique of the macrograph. Thus, the number of channels necessary to solve the CAP is at least as large as the weight of the densest macrograph clique. Mathematically, we let C be the set of all cliques of a macrograph $M(\Omega, \Gamma, W)$ with corresponding micrograph G(V, E). Then

$$\chi(G) \ge \max_{C_i \in C} \Big( \sum_{\omega_j \in C_i} w_j \Big).$$

Note that the bound is stated in terms of the connective properties (cliques) of the macrograph only, and the weights of the macronodes.
It is an important lower bound on the number of channels needed to solve the channel assignment problem satisfactorily; the bound is often met in practical systems. Unfortunately, it is difficult to determine whether or not it will be met in a particular system.
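For small macrographs the densest clique can be found by exhaustive search, as in the sketch below. The brute-force enumeration is an illustrative choice made here, not a method described in the report, and it is practical only for a modest number of cells; the sample layout is hypothetical.

```python
from itertools import combinations


def densest_clique_weight(weights, edges):
    """Largest total weight of any clique of the macrograph (brute force)."""
    adjacent = {frozenset(e) for e in edges}
    cells = list(weights)
    best = 0
    for size in range(1, len(cells) + 1):
        for subset in combinations(cells, size):
            if all(frozenset(p) in adjacent for p in combinations(subset, 2)):
                best = max(best, sum(weights[c] for c in subset))
    return best


weights = {1: 7, 2: 8, 3: 10, 4: 2}
edges = [(1, 2), (1, 3), (2, 3), (4, 1), (4, 2)]
lower = densest_clique_weight(weights, edges)
```

In this layout the cliques {1, 2, 3} and {1, 2, 4} have weights 25 and 17, so at least 25 channels are needed.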

While we have presented here only two very simple bounds, the method we have outlined has been successfully applied to much more complex bounds and to quite a number of heuristic graph-coloring algorithms [2].

C.  Vertical  Partitioning 

It is evident from the relationships which we have established between macrographs and micrographs that the chromatic number of the micrograph is somehow dependent on the structure of the macrograph. For instance, consider a macrograph with equal weights, w, on all the nodes. For this situation, we have $\chi(G) \le w \cdot \chi(M)$ where M is the macrograph associated with G. A coloring is obtained for G by defining a chromatic coloring for M and letting each of these macrograph colors represent w distinct colors for the micrograph. This concept can be extended to graphs with nonuniform macronode weights to derive a bound on the chromatic number of any micrograph.

Order the cell weights, $\{w_i\}$, of a system so that $w_1 < w_2 < \cdots < w_\ell$ where the system has k cells and $k \ge \ell$. Further, let $\beta_i$ be the set of cells with weight $w_i$. Suppose that we know $\chi(M)$ and assign the first $\chi(M)$ channels in such a way that every cell in the system is assigned one channel, that is, find a chromatic coloring for M and assign one channel to each color class. Assuming that each cell needs more than one channel, we assign the next $\chi(M)$ channels in the same way. We continue until eventually the cell needing the fewest channels has been satisfied. Since this cell had weight $w_1$, we have now assigned $\chi(M) \cdot w_1$ channels. The cells in set $\beta_1$ are now satisfied; consider next a new macrograph with these cells eliminated. Since the remaining cells have been assigned $w_1$ channels, subtract $w_1$ from their original weights. If we now find the chromatic number of the new macrograph, we can proceed as we did with the original problem until we have completely satisfied all cells needing $w_2$ channels. This process can be continued iteratively until the entire system is covered. The assignment which results is obviously an upper bound on the minimum assignment. This is a vertical partition of the original macrograph; we partition the weight of each macronode into parts, one satisfied at each iteration of the above process.
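The iteration can be sketched as follows. The exhaustive chromatic-number routine is a stand-in chosen here for brevity and is usable only on very small macrographs; the function names and the sample system are assumptions.

```python
from itertools import product


def chromatic_number(cells, edges):
    """Chromatic number of the unweighted macrograph induced on `cells`."""
    cells = list(cells)
    if not cells:
        return 0
    index = {c: i for i, c in enumerate(cells)}
    pairs = [(index[a], index[b]) for a, b in edges if a in index and b in index]
    for colors in range(1, len(cells) + 1):
        for assignment in product(range(colors), repeat=len(cells)):
            if all(assignment[i] != assignment[j] for i, j in pairs):
                return colors
    return len(cells)


def vertical_partition_bound(weights, edges):
    """Upper bound on channels from repeatedly serving the lightest cells."""
    remaining = {c: w for c, w in weights.items() if w > 0}
    total = 0
    while remaining:
        step = min(remaining.values())
        # Each of the `step` rounds uses chi(M) fresh channels.
        total += step * chromatic_number(remaining, edges)
        # Drop satisfied cells and reduce the weights of the others.
        remaining = {c: w - step for c, w in remaining.items() if w > step}
    return total


bound = vertical_partition_bound({"a": 1, "b": 2, "c": 3}, [("a", "b"), ("b", "c")])
```

For the three-cell chain with weights 1, 2 and 3 the stages contribute 2, 2 and 1 channels, giving a bound of 5, which here coincides with the densest-clique lower bound (cells b and c).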

The principle of vertical partitioning is even more general than the above examples suggest. Suppose we have a macrograph $M(\Omega, \Gamma, W)$ corresponding to the micrograph G(V,E). Consider k copies of the macrograph differing only in their node weights. Call them $M_1(\Omega, \Gamma, W^{(1)}), M_2(\Omega, \Gamma, W^{(2)}), \ldots, M_k(\Omega, \Gamma, W^{(k)})$ and let them correspond to micrographs $G_1, G_2, \ldots, G_k$ respectively. Furthermore let the $w_j^{(i)}$ satisfy

$$\sum_{i=1}^{k} w_j^{(i)} = w_j, \qquad j = 1, 2, \ldots, n,$$

where n is the number of macronodes in the original macrograph (see Figure 3). We can constrain ourselves to serve each of the k copies with disjoint sets of channels and solve the problems independently. This does not guarantee an optimum solution to the original problem but it provides a bound on the optimum.


[Fig. 3. A vertical partition of $M(\Omega, \Gamma, W)$ into copies 1 through k.]


D.  The  Reduction  Theorem 

One of the most useful results of this report can be derived from a bound on chromatic number due to Matula. It states that for any cut, $\chi(G)$ is bounded above by the maximum of the chromatic numbers of the resulting subgraphs or the number of edges cut plus one, whichever is larger. Note that if one of the subgraph chromatic numbers is the largest of these quantities, the bound must be met with equality, since the chromatic number of the graph cannot be smaller than the chromatic number of one of its subgraphs.
Now consider a star-cut* of the micrograph. The chromatic number of one of the resulting subgraphs (that with only one vertex in it) can be neglected since it is only equal to one. Moreover, the number of edges cut is equal to the degree of the vertex which is isolated. The bound thus leads to the following assertion (the Reduction Theorem):

If  the  chromatic  number  of  the  subgraph  remaining  after  a vertex  is  removed 
from  the  graph  is  greater  than  the  degree  of  the  removed  vertex,  then  the 
chromatic  number  of  the  original  graph  is  equal  to  the  chromatic  number  of 
the  subgraph. 

Suppose  we  have  a lower  bound  on  the  chromatic  number  of  the  subgraph,  e.g.  , 
the  largest  clique.  If  the  degree  of  the  removed  vertex  is  less  than  this  bound,  then 
the  vertex  need  not  be  considered  when  searching  for  the  chromatic  number  of  the  graph. 
This  theorem  can  be  used  iteratively;  since  each  successful  step  removes  a vertex 
from  the  graph,  the  degrees  of  some  of  the  remaining  vertices  are  decreased.  Thus  a 
vertex  which  does  not  meet  the  condition  on  the  first  iteration  may  still  be  eliminated 
in  a subsequent  iteration.  Once  a subgraph  has  been  reached  which  cannot  be  reduced 
further  with  this  theorem,  and  it  is  colored  with  some  number,  n,  of  colors,  the  elim- 
inated vertices  may  be  colored  in  the  reverse  order  of  their  elimination  without  using 
any  new  colors.  This  is  so  because  the  degree  of  each  vertex  in  the  subgraph  from 
which  it  was  removed  had  to  be  less  than  n for  it  to  be  eliminated. 

So far we have spoken of the Reduction Theorem as it applies to micrographs; but we have stressed all along that the micrograph is an unacceptable model for the CAP because of its size. We must therefore adapt the Reduction Theorem to macrographs in order to take full advantage of it. This can be done through the use of two observations made earlier. The first is that the degrees of all microvertices in a particular macronode are the same. Thus, if one microvertex can be eliminated, then all other microvertices in the same macronode can be eliminated. The second observation is that the degree of a microvertex can be determined from macrographic properties alone; it is one less than the sum of the weights of the containing macronode and its neighboring macronodes. The Reduction Theorem can hence be applied directly to macrographs; through computation of sums of macrograph weights, entire macronodes can be eliminated from consideration in the CAP.

*A star-cut of a graph separates one vertex from the remainder of the graph.

[Fig. 4. Example for reduction theorem.]

The use of the Reduction Theorem is illustrated through the example in Figure 4. The densest clique consists of macronodes 1 to 3 and gives a lower bound of 25. The degree of each microvertex in macronode 4 is 28 while the degree of each microvertex in macronode 5 is 21. Therefore macronode 5 can be eliminated immediately. But the degree of each microvertex in macronode 4 with respect to the remaining micrograph is 16 and so macronode 4 can also be eliminated. Now if the first seven channels are assigned to cell 1, the next eight to cell 2, and the next ten to cell 3, the micrograph remaining after using the Reduction Theorem will have been chromatically colored. We may assign channels 24 and 25 to cell 4 and channels 1 through 7 and 16 through 18 to cell 5 and the solution is complete.
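The iterative use of the theorem on a macrograph can be sketched as below. The figure itself is not reproduced here, so the weights and adjacencies in the sample are hypothetical values chosen only so that the microvertex degrees come out as the 28, 21 and 16 quoted above; the function name and the convention of passing a known clique as the protected core are likewise assumptions.

```python
def reduce_macrograph(weights, edges, core):
    """Eliminate macronodes whose microvertex degree is below the clique bound.

    `core` is a known clique of the macrograph; its total weight is a lower
    bound on the chromatic number of any subgraph that still contains it,
    so core cells are never candidates for elimination.
    """
    neighbors = {cell: set() for cell in weights}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    bound = sum(weights[c] for c in core)
    remaining = set(weights)
    eliminated = []
    changed = True
    while changed:
        changed = False
        for cell in sorted(remaining - set(core)):
            live = neighbors[cell] & remaining
            degree = weights[cell] + sum(weights[n] for n in live) - 1
            if degree < bound:
                remaining.remove(cell)
                eliminated.append(cell)
                changed = True
    return remaining, eliminated


# Hypothetical data (not taken from Fig. 4): cells 1-3 form the densest clique.
weights = {1: 7, 2: 8, 3: 10, 4: 2, 5: 12}
edges = [(1, 2), (1, 3), (2, 3), (4, 1), (4, 2), (4, 5), (5, 2)]
remaining, eliminated = reduce_macrograph(weights, edges, core={1, 2, 3})
```

On the first pass cell 4 has degree 28 and is kept while cell 5 (degree 21) is removed; on the second pass the degree of cell 4 has dropped to 16 and it is removed as well. The eliminated cells would then be colored in the reverse order of their elimination.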

Figures 5 to 7 are cell layouts for a real system at various stages of growth. The first shows the system shortly before a second cell-size is introduced, the second shows the system just after the densest cells have been split and the last shows a mature system close to saturation, given the smallest cell-size and available spectrum. They were reduced to 7, 25, and 15 cells respectively. The latter two examples give an indication of the value of the Reduction Theorem in systems with two cell-sizes. The fact that it performs well in these layouts is significant since their lack of symmetry makes them more difficult to analyze than single cell-size layouts. The effectiveness of the Reduction Theorem in the mature system of Fig. 7 is especially encouraging.

[Fig. 5. Cell layout No. 1.]

[Fig. 6. Cell layout No. 2.]

[Fig. 7. Cell layout No. 3.]

E.  Summary 

We  have  presented  in  this  report  a number  of  techniques  which  can  be  used  to  make 
the  Channel  Assignment  Problem  in  cellular  mobile  telecommunications  systems  a 
tractable  problem. 

The  first  technique  is  simply  the  development  of  the  macrographic  model.  But 
many  systems  have  so  many  cells  that  this  is  not  sufficient.  The  Reduction  Theorem 
allows  us  to  reduce  almost  any  system  to  its  most  densely  populated  core  of  cells.  Once 
this  has  been  done,  the  macrographic  representations  of  well-known  bounds  and  algo- 
rithms  can  be  used  to  solve  the  CAP  on  these  dense  cells.  As  pointed  out  in  the  last 
section,  the  cells  which  were  eliminated  can  then  easily  be  assigned  channels  out  of 
those  used  for  the  central  core. 
QUEUEING ANALYSIS OF INTERNODAL ACKNOWLEDGMENT IN COMPUTER-COMMUNICATION NETWORKS
R.  Boorstyn  and  J.  Stark 

In  order  to  provide  reliable  communications  in  a computer  network,  messages 
must  be  verified  and  corrected  if  in  error.  The  systems  to  be  studied  in  this  work  use 
automatic-repeat-request (ARQ) to correct errors. If a message is detected to be in
error,  then  the  receiving  node  will  request  that  the  message  be  repeated.  For  the 
transmitting  node  to  repeat  the  message,  it  must  store  all  messages  which  it  transmits 
until  it  is  certain  that  they  have  been  properly  received.  Thus  the  buffer  contains  both 
messages  waiting  to  be  transmitted  and  those  waiting  to  be  acknowledged.  This  study 
attempts  to  develop  techniques  which  can  minimize  this  holding  time  and  total  buffer 
occupancy  and  thereby  increase  the  throughput  between  the  two  nodes. 

Various methods have been used to return acknowledgements. In the early ARPA
system, a short message carrying the acknowledgement was placed on the output buffer
with priority over all other messages.¹ This method, however, degraded the message
throughput because of the constant flow of acknowledgement messages. To alleviate the
problem, a piggyback scheme was implemented. The purpose of this study is to compare
two different piggyback schemes so that techniques can be developed which will
minimize the holding time and thereby increase the throughput between the two nodes.

Two different types of piggyback return schemes will be compared. In both of
these schemes, a certain number of bits of each message are reserved for the
acknowledgement information of the reverse channel. In the first system, which we call pure
piggyback (PP), when there are no messages to be transmitted, the acknowledgement
information is stored until a message which can carry back this information is
processed. In the second system, which we call ack generated piggyback (AGP), when a return
message is not available, the acknowledgement information will not be stored. Instead,
this information will generate its own acknowledgement message on which it can return
to the originating node.
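
The difference between the two schemes can be summarized in a few lines of code. The following is a minimal sketch of the decision a node makes when the line becomes free; the `Frame` structure and the function `next_frame` are hypothetical names introduced only for illustration and do not come from the systems studied here.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """One transmission on the line (hypothetical structure for illustration)."""
    carries_message: bool  # True for a regular message, False for an ack-only message
    acks: int              # number of acknowledgements carried back


def next_frame(policy, queued_messages, pending_acks):
    """Decide what a node sends next under policy "PP" or "AGP".

    Returns a Frame, or None when the node stays silent.
    """
    if policy not in ("PP", "AGP"):
        raise ValueError("policy must be 'PP' or 'AGP'")
    if queued_messages > 0:
        # Both schemes: the reserved bits of a regular message carry the acks.
        return Frame(carries_message=True, acks=pending_acks)
    if pending_acks > 0 and policy == "AGP":
        # AGP: no return message is available, so the acks generate their own message.
        return Frame(carries_message=False, acks=pending_acks)
    # PP with no queued message: the acks are stored until a message can carry them.
    return None


if __name__ == "__main__":
    print(next_frame("PP", 0, 2))   # acks are held back
    print(next_frame("AGP", 0, 2))  # an ack-only message is generated
```

The only branch on which the two policies differ is the one with an empty message queue and outstanding acknowledgements, which is exactly the light-load situation.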

Previous work in this area has included that of A. Danthine and J. Bremer² and
S. Lam.³ Lam studied a model similar to our ack-generated piggyback system. However,
he neglected the dependence that exists between the two nodes. Specifically, the
length of time it takes for an acknowledgement to return to the originating node is
dependent on the amount and type of traffic at the receiving node.

Fig. 1. Simplified queueing model of a node pair.

A queueing model which can take into account the effects of different acknowledgement
protocols of a node pair is shown in Fig. 1 and is described by the following
variables:

= average  number  of  messages  per  second  which  enter  the  originating  node 
and  are  destined  for  the  receiving  node. 

= average  total  traffic  which  is  transmitted  from  the  originating  node  to  the 
receiving  node  (messages  per  second). 

A similar  set  of  variables  is  also  defined  for  the  receiving  node,  where  1 and  2 are 
interchanged.  The  originating  station  is  labeled  1 and  the  receiving  station  is  labeled 
2.  For  simplicity,  errors  in  message  and  acknowledgement  transmission  and  time-out 
errors  will  be  ignored. 

A.  Pure  Piggyback 

In  the  pure  piggyback  system,  each  transmitted  message  has  a certain  number  of 
bits  which  are  reserved  exclusively  for  the  transmittal  of  all  acknowledgements  which 
are  to  be  returned.  The  acknowledgement  can  only  be  transmitted  as  part  of  a message. 
Therefore, if there are no messages to be transmitted in one direction, the
acknowledgements of the messages travelling in the opposite direction will queue, filling the
available buffer space and causing all useful communication to cease.

In  order  to  obtain  initial  solutions,  certain  assumptions  were  made  in  the  model. 

It will be assumed that the error probabilities are zero. This implies that messages
will rarely be repeated and, therefore, the time-out property of the queue can be neglected.
In addition, and for simplicity, it is assumed that all messages waiting to be
acknowledged will be properly acknowledged when the current message is received.

Once a message begins transmission its content is generally fixed for the duration.
The acknowledgement information can be contained at the end of a message. Thus, if
a message arrives while a return message is in the process of being transmitted, its
acknowledgement could be added to the message. Although this is not usually done, we
assume this operation here to simplify the model. These assumptions tend to reduce
the blocking probability due to messages waiting to be acknowledged.

Due to the second assumption above, the number of messages waiting to be
acknowledged at node 1 is equal to the number of acknowledgements in the ACK Queue of
node 2. The probability of being in any state S is represented as $P_{M_1 A_1 M_2 A_2}$, where

$M_1$ = number of messages to be transmitted at node 1

$M_2$ = number of messages to be transmitted at node 2

$A_1$ = number of messages that are waiting to be acknowledged at node 1

$A_2$ = number of messages that are waiting to be acknowledged at node 2

If $N_1$ and $N_2$ are the maximum buffer sizes at node 1 and node 2 respectively, then

$$N_1 \ge M_1 + A_1$$

and

$$N_2 \ge M_2 + A_2.$$

We will first solve the problem for the case when the maximum buffer size is two,
and then show how the problem would be solved for a general buffer size. Figure 2
shows the complete state diagram for the case when $N_1 = N_2 = 2$.

The set of states {(0000), (0010), (1000), (1010), (0020), (1020), (2020), (2010),
(2000)} forms an initialization set, which can be neglected when obtaining the steady state
solution. The system can be in one of these states initially, but once the system leaves
the initialization set, it will never return to any state in this set. The system
leaves the initialization set when the first message is transmitted.

Because of the previous assumption that an acknowledgement can piggyback on to
messages which are in the process of being transmitted, the states {(0201), (0101),
(0202), (0102)} are unreachable. Since the probability of being in any of these states
when the system has reached steady state is zero, they can be neglected.

In order to simplify the solution somewhat, we will assume that the network is
symmetric; namely, that the steady-state input traffic rate is the same for both nodes.
The transmitting line speed is also assumed to be identical in both directions between the
two nodes. These assumptions are often valid for distributed networks. The assumption
of equal traffic rates would usually not be correct, however, for centralized or
star networks. It should be noted that the acknowledgement system being discussed
here should not be used in a network where the traffic rate is extremely unbalanced
between the two nodes. In such a network, the transmitted messages of the busy node
will have to wait an inordinate amount of time for acknowledgements to return, causing
serious blocking problems at the node.

Fig. 2. Probability state diagram of the two buffer pure piggyback system.

Fig. 3. Probability state diagram of the two buffer symmetric pure piggyback system.

Due to the symmetry assumptions, $P_{M_1 A_1 M_2 A_2} = P_{M_2 A_2 M_1 A_1}$, and the probability
state diagram for the two buffer case now reduces to nine states, as shown in Figure 3.
The set of state equations that needs to be solved is easily found to be:


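$$2\rho P_{0100} = P_{0210} + P_{0110} \tag{1}$$

$$(2\rho+1) P_{0110} = \rho P_{0100} + P_{1110} \tag{2}$$

$$(\rho+1) P_{0120} = \rho P_{0110} \tag{3}$$

$$(\rho+1) P_{1100} = \rho P_{0100} + P_{0120} + P_{0220} \tag{4}$$

$$(\rho+2) P_{1110} = \rho P_{0110} + \rho P_{1100} + P_{1120} \tag{5}$$

$$2 P_{1120} = \rho P_{1110} + \rho P_{0120} \tag{6}$$

$$\rho P_{0200} = P_{1100} \tag{7}$$

$$(\rho+1) P_{0210} = \rho P_{0200} + P_{1110} \tag{8}$$

$$P_{0220} = \rho P_{0210} + P_{1120} \tag{9}$$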

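where $\rho$ is the traffic intensity, that is, the message arrival rate normalized by the
transmission rate of the line.

The solution of these equations, and thereby of the entire state diagram, can be
obtained numerically for specific values of $\rho$ in the interval $0 < \rho < 1$.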

Once the state probabilities of the system are known, such system characteristics
as the probability of blocking, the average time delay of messages and
acknowledgements, and the throughput of the network can be determined. The results have been
plotted in Figures 4 to 8.

Fig. 8. Total time delay for the pure piggyback system.

There are three different means by which a node can be blocked:

(1) Acknowledgement Blocking (BA): occurs when all the available buffers of a
node are being used to store messages which have been transmitted but have
not been acknowledged. The probability of its occurrence for this example
is equal to the probability of being in any of the states {(0200), (0210), (0220)}.

(2) Message Blocking (BM): occurs when all the available buffers of a node are
being used to store messages waiting for transmission to the next node. The
probability of its occurrence for this example is equal to the probability of
being in any of the states {(2011), (2002), (2001)}.

(3) Total Blocking (BT): occurs when all available buffers of a node are being
used. The probability of its occurrence for this example is equal to the
probability of being in any of the states {(0200), (0210), (0220), (2011), (2002),
(2001), (1100), (1110), (1120)}.

Two interesting facts are observed from the graphs of blocking probability. First,
the acknowledgement blocking probability is almost independent of $\rho$ and has a nearly
constant value of 0.25. Further, the total blocking probability is simply a linear function
of $\rho$, the function being $P(BT) = 0.31\rho + 0.25$.
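
The two buffer case is small enough to be solved exactly. The sketch below is an illustration rather than the program used in this study: it writes flow-balance equations for the nine symmetric states (spelled out in the comments), assumes the line service rate is normalized to one, and assumes that the nine probabilities sum to 1/2 because each state stands for itself and its mirror image. The ninth balance equation is redundant and is replaced by this normalization. The function names are ours.

```python
from fractions import Fraction

# The nine states (M1 A1 M2 A2) left after the symmetry reduction.
STATES = ["0100", "0110", "0120", "1100", "1110", "1120", "0200", "0210", "0220"]


def balance_rows(r):
    """Flow-balance equations, each written as sum(coeff * P_state) = 0."""
    return [
        {"0100": 2 * r, "0210": -1, "0110": -1},              # 2r P0100 = P0210 + P0110
        {"0110": 2 * r + 1, "0100": -r, "1110": -1},          # (2r+1) P0110 = r P0100 + P1110
        {"0120": r + 1, "0110": -r},                          # (r+1) P0120 = r P0110
        {"1100": r + 1, "0100": -r, "0120": -1, "0220": -1},  # (r+1) P1100 = r P0100 + P0120 + P0220
        {"1110": r + 2, "0110": -r, "1100": -r, "1120": -1},  # (r+2) P1110 = r P0110 + r P1100 + P1120
        {"1120": 2, "1110": -r, "0120": -r},                  # 2 P1120 = r P1110 + r P0120
        {"0200": r, "1100": -1},                              # r P0200 = P1100
        {"0210": r + 1, "0200": -r, "1110": -1},              # (r+1) P0210 = r P0200 + P1110
    ]


def solve_two_buffer(rho):
    """Return the nine state probabilities as exact fractions (rho > 0)."""
    rho = Fraction(rho)
    n = len(STATES)
    col = {s: i for i, s in enumerate(STATES)}
    rows = []
    for eq in balance_rows(rho):
        row = [Fraction(0)] * (n + 1)
        for state, coeff in eq.items():
            row[col[state]] = Fraction(coeff)
        rows.append(row)
    # Normalization (assumption): the nine probabilities sum to 1/2.
    rows.append([Fraction(1)] * n + [Fraction(1, 2)])
    # Gauss-Jordan elimination on the augmented matrix.
    for c in range(n):
        pivot = next(k for k in range(c, n) if rows[k][c] != 0)
        rows[c], rows[pivot] = rows[pivot], rows[c]
        lead = rows[c][c]
        rows[c] = [v / lead for v in rows[c]]
        for k in range(n):
            if k != c and rows[k][c] != 0:
                factor = rows[k][c]
                rows[k] = [a - factor * b for a, b in zip(rows[k], rows[c])]
    return {s: rows[col[s]][n] for s in STATES}


def blocking(p):
    """P(BA), P(BM), P(BT) for one node of the two buffer system."""
    p_ba = p["0200"] + p["0210"] + p["0220"]
    # (2001), (2011), (2002) are the mirror images of (0120), (1120), (0220).
    p_bm = p["0120"] + p["1120"] + p["0220"]
    p_bt = p_ba + p_bm + p["1100"] + p["1110"] + p["1120"]
    return p_ba, p_bm, p_bt


if __name__ == "__main__":
    for rho in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
        p_ba, p_bm, p_bt = blocking(solve_two_buffer(rho))
        print(float(rho), float(p_ba), float(p_bm), float(p_bt))
```

Under these assumptions, $\rho = 1/2$ gives $P(BA) = 77/316 \approx 0.244$ and $P(BT) = 65/158 \approx 0.411$, in line with the nearly constant value of 0.25 and with the linear fit, which gives 0.405 at that load.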

In order to compare the various cases which are to be studied, it is necessary to
introduce the concept of effective $\rho$. It is determined by using the mean rate of
messages that actually enter the system and are not blocked. This arrival rate is defined
as

$$\lambda' = \lambda(1 - P(BT)).$$

The effective $\rho$ is actually a measure of the throughput of the system. While it is
possible to operate the system with the value of $\rho$ greater than one, the value of effective
$\rho$ can never be greater than one. Operating with $\rho$ greater than one, however, would not
be very efficient, since most messages would be blocked and have to be repeated. The
effective $\rho$, which is plotted in Fig. 6, starts off equal to $\rho$ initially, but then begins
to increase less rapidly and eventually would become constant for large values of $\rho$.
This type of curve is found in all cases which we have studied.

The total delay time of a message is defined as the sum of the message delay time
from the transmitting node to the receiving node plus the delay time of the returning
acknowledgement. This is the total time that the buffer must store a message. The
various time delays are found by application of Little's formula. The average number
of messages in each of the queues is found from the computer results. This value is
divided by the actual rate of messages entering the system. Since it is assumed that
messages and acknowledgements are not lost in the system, the acknowledgement
arrival rate is equal to the message arrival rate.
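
The two steps just described, the correction of the arrival rate for blocking and the application of Little's formula, can be sketched as follows. The function names are ours, the service rate is assumed to be normalized to one (so that the effective $\rho$ equals the accepted arrival rate numerically), and `fitted_p_bt` simply restates the linear fit quoted for the two buffer pure piggyback system.

```python
def effective_rate(arrival_rate, p_total_blocking):
    """Accepted arrival rate: lambda' = lambda * (1 - P(BT))."""
    if not 0.0 <= p_total_blocking <= 1.0:
        raise ValueError("P(BT) must lie between 0 and 1")
    return arrival_rate * (1.0 - p_total_blocking)


def little_delay(mean_number_in_queue, accepted_rate):
    """Little's formula: mean time in queue = mean number / accepted arrival rate."""
    if accepted_rate <= 0.0:
        raise ValueError("accepted arrival rate must be positive")
    return mean_number_in_queue / accepted_rate


def fitted_p_bt(rho):
    """Linear fit quoted in the text for the two buffer PP system."""
    return 0.31 * rho + 0.25


if __name__ == "__main__":
    for rho in (0.2, 0.5, 1.0):
        print(rho, effective_rate(rho, fitted_p_bt(rho)))
```

The mean queue lengths passed to `little_delay` would come from the computed state probabilities; none are assumed here.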

The message delay time initially starts at zero and then increases rapidly as the
effective $\rho$ increases. The total delay time, however, starts at one unit of time and
then increases rapidly as a function of effective $\rho$. This peculiarity is due to the delay
time of an acknowledgement in this system. Even with no new input traffic the previous
message must still be acknowledged. However, the acknowledgement cannot be
transmitted until a new message arrives. This minimum delay time would be reduced if a
time-out period of less than one time unit were used.

B. Solution for General Buffer Size

In the previous example the storage buffer was limited to only two storage
locations. While this is an interesting problem, in most practical systems the buffer size
would be larger. In order to obtain some insight into this problem, the Markov state
diagram must first be obtained. This diagram for the general buffer case has been
obtained and the appropriate set of difference equations developed.

To find the state probabilities from these equations is extremely complex. In
order to solve these equations a computer program, capable of obtaining solutions for
any finite buffer size, was written. To write the balance equation for every state very
quickly becomes an enormous and time consuming task for the computer. This is
apparent when one realizes that the number of states in this case is equal to $N(N+1)^2/2$,
where $N$ is the maximum number of messages a buffer can store. We have used a
technique which reduces the number of equations to $N^2$. The set of $N^2$ equations which
must eventually be solved only contains the unknown states

$$\{P_{M_1 A_1 M_2 A_2} : A_1 = 1,\ A_2 = 0,\ N-1 \ge M_1 \ge 0,\ N-1 \ge M_2 \ge 0\}.$$

These are just the states of the first plane ($A_1 = 1$) of the state diagram. It is possible
to obtain all the other states in terms of these first plane states. For example, the states
of the first row ($M_1 \ge 0$, $M_2 = 0$, $A_1 \ge 1$) can all be written in terms of the states
$\{P_{M_1 1 0 0} : N-1 \ge M_1 \ge 0\}$. Similarly, the state probabilities of the second row
$\{P_{M_1 A_1 1 0} : N - A_1 \ge M_1 \ge 0,\ N \ge A_1 \ge 2\}$ can be written in terms of the state
probabilities of the first two rows of the first plane
$\{P_{M_1 1 M_2 0} : N-1 \ge M_1 \ge 0,\ 1 \ge M_2 \ge 0\}$. This process can be continued until all state
probabilities have been found. By use of this algorithm we were able to obtain solutions
for a maximum buffer size of 8 in less than 3 minutes, using only 180K bytes of
storage.
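
The state counts can be checked by direct enumeration. The sketch below assumes that the symmetry-reduced state set consists of all $(M_1, A_1, M_2, A_2)$ with $A_1 \ge 1$, $A_2 = 0$, $M_1 + A_1 \le N$ and $M_2 \le N$, which reproduces the nine states of the two buffer case; the function names are ours.

```python
def symmetric_states(n):
    """States (M1, A1, M2, A2) kept after the symmetry reduction."""
    return [(m1, a1, m2, 0)
            for a1 in range(1, n + 1)        # messages awaiting acknowledgement at node 1
            for m1 in range(n - a1 + 1)      # buffer limit: M1 + A1 <= N
            for m2 in range(n + 1)]          # buffer limit at node 2 with A2 = 0


def first_plane_unknowns(n):
    """The N^2 first plane states (A1 = 1) that remain as unknowns."""
    return [(m1, 1, m2, 0) for m1 in range(n) for m2 in range(n)]


if __name__ == "__main__":
    for n in (2, 5, 8):
        print(n, len(symmetric_states(n)), len(first_plane_unknowns(n)))
```

For a buffer size of 8 this gives 324 states against 64 unknowns, which indicates why the reduction matters for the running time.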


C. Results for the General Buffer Case

Using the computer program described above, this example was solved for maximum
buffer sizes of five and eight. In these cases, the effects of blocking are of less
importance. Therefore, the value of effective $\rho$ does not deviate as much from the
actual value of $\rho$. The various blocking probabilities which were previously defined
can now be determined from the following general equations:


$$P(BA) = \sum_{M_2=0}^{N} P_{0\,N\,M_2\,0}$$

$$P(BM) = \sum_{M_1=0}^{N-1} \sum_{A_1=1}^{N-M_1} P_{M_1 A_1 N\,0}$$

$$P(BT) = P(BA) + P(BM) + \sum_{M_1=1}^{N-1} \sum_{M_2=0}^{N} P_{M_1,\,N-M_1,\,M_2,\,0}$$

Similarly, the various average time delays can be determined by applying Little's formula
to the average number in each buffer, which is determined from the following equations:

$$\bar{N}_A = \sum_{M_2=0}^{N} \sum_{M_1=0}^{N-1} \sum_{A_1=1}^{N-M_1} A_1 P_{M_1 A_1 M_2 0}$$

$$\bar{N}_M = \sum_{M_2=0}^{N} \sum_{M_1=0}^{N-1} \sum_{A_1=1}^{N-M_1} (M_1 + M_2) P_{M_1 A_1 M_2 0}$$

$$\bar{N}_T = \bar{N}_A + \bar{N}_M$$

The effective message input rate is determined exactly as before:

$$\lambda' = \lambda(1 - P(BT)).$$
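
For readers who wish to reproduce the blocking probabilities for other buffer sizes, the following sketch builds the Markov chain directly from the model assumptions and finds its steady state by power iteration on the uniformized chain. This is a substitute technique chosen for brevity, not the reduction algorithm described in this report. It assumes a service rate of one in both directions, an arrival rate $\rho$ at both nodes, rejection of arrivals at a full buffer, and that a completed transmission acknowledges everything outstanding in the opposite direction. All names are ours.

```python
def recurrent_states(n):
    """Steady-state states (M1, A1, M2, A2): exactly one of A1, A2 is nonzero."""
    half = [(m1, a1, m2, 0)
            for a1 in range(1, n + 1)
            for m1 in range(n - a1 + 1)
            for m2 in range(n + 1)]
    mirror = [(m2, a2, m1, a1) for (m1, a1, m2, a2) in half]
    return half + mirror


def transitions(state, rho, n):
    """List of (next_state, rate) pairs leaving the given state."""
    m1, a1, m2, a2 = state
    out = []
    if m1 + a1 < n:
        out.append(((m1 + 1, a1, m2, a2), rho))      # arrival accepted at node 1
    if m2 + a2 < n:
        out.append(((m1, a1, m2 + 1, a2), rho))      # arrival accepted at node 2
    if m1 > 0:
        out.append(((m1 - 1, a1 + 1, m2, 0), 1.0))   # node 1 sends; node 2's messages are acked
    if m2 > 0:
        out.append(((m1, 0, m2 - 1, a2 + 1), 1.0))   # node 2 sends; node 1's messages are acked
    return out


def steady_state(rho, n, tol=1e-13, max_iter=200000):
    """Steady-state probabilities by power iteration on the uniformized chain."""
    states = recurrent_states(n)
    index = {s: i for i, s in enumerate(states)}
    uniform_rate = 2.0 * rho + 2.5                   # strictly above every total exit rate
    moves = [[(index[t], r / uniform_rate) for t, r in transitions(s, rho, n)]
             for s in states]
    pi = [1.0 / len(states)] * len(states)
    for _ in range(max_iter):
        new = [0.0] * len(states)
        for i, row in enumerate(moves):
            stay = 1.0
            for j, p in row:
                new[j] += pi[i] * p
                stay -= p
            new[i] += pi[i] * stay                   # self loop of the uniformized chain
        change = max(abs(x - y) for x, y in zip(new, pi))
        pi = new
        if change < tol:
            break
    return dict(zip(states, pi))


def total_blocking(pi, n):
    """P(BT) at node 1: every buffer holds a queued or unacknowledged message."""
    return sum(p for (m1, a1, _, _), p in pi.items() if m1 + a1 == n)


if __name__ == "__main__":
    rho = 0.5
    for n in (2, 5):
        p_bt = total_blocking(steady_state(rho, n), n)
        print(n, p_bt, rho * (1.0 - p_bt))           # buffer size, P(BT), effective rho
```

Power iteration is slow at very light loads, where the chain mixes on a time scale of $1/\rho$; a direct linear solve would be preferable there.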

The results for these two cases have been plotted in Figures 6, 7, and 8. As is to be
expected, the blocking probabilities decrease dramatically as the buffer size is
increased. From Fig. 7 it is seen that the blocking probability curve of the 8 buffer
system can be approximated by that of an acknowledgement free system. It should be noted
that the effective $\rho$, which is related to the blocking probability, for the 8 buffer case
is approximately equal to $\rho$ until the traffic rate becomes very high. This, of course, is
to be expected, since so few incoming messages must be rejected by the system in
this case.


The acknowledgement time delay again has an initial value of one, but now remains
virtually constant for all values of $\rho$. Of greater interest is the total time that a
message is stored at a node. This characteristic is interesting in that, when it is plotted
against $\rho$, the average time delay for large $\rho$ increases as the buffer size increases.
This is not surprising, though, when one realizes that, due to blocking, the small buffer
limits the maximum message delay. For the larger buffer, however, there are occasions,
although rare, when the maximum of eight messages is awaiting transmission. When
plotted against effective $\rho$, it is seen that the total delay time of the 2 buffer case
is indeed the poorest. Also, while the total time delay of the 8 buffer case is initially
slightly lower than that of the 5 buffer case, it increases at a faster rate and finally
surpasses that of the 5 buffer case when the effective $\rho$ is extremely high.


From these results, it is apparent that while 2 buffers is not enough storage, 5
buffers should be sufficient. Increasing the number of buffers to 8 will only increase
the throughput slightly and will not significantly improve the average time delay of the
system.

D.  Ack  Generated  Piggyback  System 

While the pure-piggyback system is quite good and easy to implement, it suffers
from one major drawback. This is the problem of large time delays for
acknowledgements under light load conditions. In practice, this problem is alleviated somewhat by
the use of a time-out mechanism. This is not a very satisfying solution since it will
cause unnecessary repetition of many good messages. A far better system, which is
to be studied here, is obtained through the use of ack generated messages. In this
system, when it is required to return an acknowledgement but there are no messages being
transmitted to the originating node, a special ack message will be transmitted.


As in the case of regular messages, it is possible for this message to return more than
one acknowledgement. However, the ack message need not have the same average size
as a regular message. In our model, once the decision is made to create an ack
message, it will be created and transmitted even if a regular message is received destined
for the originating node. The ack messages are of course not to be acknowledged.


In order to solve the problem, the probability state diagram and its associated set
of difference equations must first be determined. From these, a computer program
similar to that used for the PP system can solve for the complete set of state
probabilities of the AGP system. Since acknowledgements are not to be acknowledged,
it is necessary to differentiate between an ack message and a regular message. This
is accomplished by means of an additional dimension for the state probabilities. This,
of course, further complicates the state diagram and any possible solution. The state
probabilities are now represented by $P_{M_1 A_1 M_2 A_2 L}$, where

$$L = \begin{cases} 0, & \text{if } M_2 \text{ is a regular message} \\ 1, & \text{if } M_2 \text{ is an acknowledgement message} \end{cases}$$

and $M_1$, $A_1$, $M_2$, $A_2$ are the same as previously defined.


In  order  to  allow  the  size  of  an  acknowledgement  message  to  be  different  from 
that  of  a regular  message,  another  parameter  s is  used.  This  parameter  s is  defined 
as  the  ratio  of  the  acknowledgement  message  size  to  that  of  the  regular  message  size. 
As  for  the  PP  system,  this  system  will  first  be  solved  for  a maximum  buffer  size  of 
two.  It  will  then  be  shown  how  the  system  can  be  solved  in  the  general  buffer  case. 


It should be noted that the previous initialization state set, in which both
acknowledgement buffers are empty, can now exist. However, the set of states for which an
acknowledgement was waiting for a message does not now exist. The state diagram for
this system does not possess the complete symmetry found in the PP system. There
is no symmetric pair in the complete state diagram for the states (00000), (10100),
(20200).


Again, the solution of the state equations can readily be obtained for specific values
of $\rho$ in the interval $0 < \rho < 1$. From these results, all the state probabilities can be
determined and the entire system characterized.

To compare the two systems, the same system characteristics were determined
for the AGP system as were determined for the PP system. A plot of P(BT)
vs. $\rho$ for various values of the parameter s is shown in Figure 9. For this system
P(BT) is initially zero. Of interest is the fact that P(BT) for this system, when the
acknowledgement message size is equal to the message size, approaches asymptotically
the curve of P(BT) for the PP system. This, of course, is to be expected. When $\rho$
is close to one, there will be a continuous stream of messages to be transmitted, so
that there will always be a message available on which an acknowledgement can be
piggybacked. It should also be noted that as the acknowledgement size is decreased to zero,
the blocking probability is improved by at least 20% for large values of $\rho$, and by over
50% for small values of $\rho$. This improvement is due to the decrease of the average
delay time of an acknowledgement, especially for low values of $\rho$.

Fig. 9. Blocking probabilities of the two buffer ack generated piggyback system.

The various time delays were also plotted against effective $\rho$ in Figure 10. These
plots show the tremendous improvement possible as the acknowledgement size is
decreased. When $s > 0.5$, the average message delay time of this system is greater than
that of the PP system. This is due to the increase in traffic because of the
acknowledgement messages that are transmitted. However, this is not the case for the more
important characteristic of total average delay. For this system the total time delay is
initially zero and then increases, approaching the total time delay of the PP system for
large values of $\rho$. In the case of $s = 1$, the two curves actually cross when $\rho' = 0.4$.
Above this value the time delay of the PP system is better than that of the AGP system.

Fig. 10. Time delays of the two buffer ack generated piggyback system.

E.  Solution  for  General  Buffer  Size 


As was done for the pure piggyback system, this system was solved for a general
buffer size. This problem is solved by a computer program similar to that used
for the pure piggyback system. The program reduces the number of unique states from
$\tfrac{1}{2}(N+1)(2N^2+N+2)$ to $\tfrac{1}{2}(3N^2+3N+2)$, where $N$ is the maximum number of messages
a buffer can store. The set of states that are to be solved is:

$$\{P_{M_1 0 M_2 0 0} : 0 \le M_1 \le N,\ M_1 \le M_2 \le N\}$$

and

$$\{P_{M_1 1 M_2 0 0} : 0 \le M_1 \le N-1,\ 1 \le M_2 \le N\}.$$

Once these state probabilities are determined, the program can solve for all the state
probabilities of the system. By use of the computer algorithm, a solution for a maximum
buffer size of 5 was obtained.
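
The size of the reduced set can be tallied directly from its two families of unknowns. The worked count below assumes the index ranges $0 \le M_1 \le N$, $M_1 \le M_2 \le N$ for the states with $A_1 = 0$, and $0 \le M_1 \le N-1$, $1 \le M_2 \le N$ for the states with $A_1 = 1$:

```latex
\underbrace{\sum_{M_1=0}^{N} (N - M_1 + 1)}_{A_1 = 0}
  \;+\; \underbrace{N \cdot N}_{A_1 = 1}
  \;=\; \frac{(N+1)(N+2)}{2} + N^2
  \;=\; \frac{3N^2 + 3N + 2}{2}.
```

For the buffer size of 5 solved here, this amounts to $21 + 25 = 46$ unknown state probabilities.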

F.  Results  for  the  General  Buffer  Case 

By means of the computer program, the various system characteristics as a function
of $\rho$ were obtained. The set of blocking probability curves, shown in Fig. 12, is
similar to the blocking probability curve of the 5 buffer pure piggyback system. Again,
the AGP system is better at low values of $\rho$, but is similar to the PP system at large
values of $\rho$. The effect of the acknowledgement message size is not as great as it was
for the 2 buffer case, since the blocking probability is already low. Similar results
are obtained for the various time delays. The message time delay for the AGP system
is greater than for the PP system. However, the total delay for the AGP system is
less than that of the PP system for values of $\rho' < 0.65$ for $s = 1$, and $\rho' < 0.77$ for $s = 0$.
Above these values, the PP system is better. The value of effective $\rho$, plotted in Fig. 11,
is not significantly affected by the acknowledgement message size. It is approximately
equal to $\rho$ until $\rho > 0.6$. The effective $\rho$ for the AGP system is slightly better than that
of the PP system until $\rho = 0.75$ for $s = 1$, and until $\rho = 0.9$ for $s = 0$. Above these values
the effective $\rho$ of the PP system is larger. It should be remembered that the effective
$\rho$ is proportional to the throughput of the system. Figure 13 gives the time delay under
similar conditions.

Fig. 11. Effective $\rho$ for the ack generated piggyback system.

G.  Conclusions  and  Future  Work 

From the results of the work reported here, it appears that 5 buffers would usually
be sufficient. The AGP system is better than the PP system at low to moderate traffic
loads. At heavy traffic load, the reverse is true. This is due to the lower overhead
required by the PP system. The point where this reversal occurs is primarily
dependent on the acknowledgement message size of the AGP system. This reversal
occurs even when the acknowledgement message size is zero. These results are
interesting, since a PP system would usually be easier to implement. If the traffic load
had a large variance, then an adaptive system, switching between the two
acknowledgement schemes, would be beneficial.

It  is  still  necessary  to  study  the  two  systems  when  some  of  the  assumptions  which 
were  used  are  removed.  Primarily  this  includes  the  lack  of  a time-out  period  and  the 
requirement  of  a symmetrical  network.  Such  analysis  is  planned  for  the  future. 

EFFECTS OF MANPOWER DEPLOYMENT AND ERROR GENERATION ON SOFTWARE RELIABILITY

M.  Shooman  and  S.  Natarajan 

Even though the software field is nearly two decades old, it is only now that
attention is being focused on understanding the important aspects of bug generation and
removal inherent in most computer programs. In this contribution we shall discuss
the dynamics of error generation and error removal during the debugging process.¹

Earlier work² assumed that the total number of errors (i.e., the sum of those
corrected and those remaining) in a program remained constant throughout the
debugging process. This leads to the conclusion that if the debugging continues long enough
we will eventually remove all the errors. In practice the contrary is true. Firstly,
we create new errors during debugging, and secondly, we stop the debugging process
before we have removed all errors. Thus the total number of errors does not remain
constant.

We  will  list  some  of  the  ways  in  which  errors  are  generated: 

(1) The correction of a bug may work locally only (i.e., the global aspects of
the error still remain).

(2) A typographical error may arise as the result of bug correction.

(3) The correction is based upon faulty analysis and does not accomplish any
bug removal.

(4) The correction is accomplished; however, it is accompanied by the creation
of a new error.

A.  Experimental  Debugging  Data 

Prior to the formulation of an error model we shall investigate the available error
data. Shooman's earlier work presents error correction data from seven different
programs. These are portrayed in Figures 1 and 2. Cumulative error correction curves
are shown in Figures 3 and 4.

Shooman and Bolsky³ report the results of a software debugging experiment. One
of the outcomes of this experiment is shown in Figure 5. We observe from the figure
that the time expended on fixing the earlier and the later discovered bugs is the same.
This leads us to postulate that the rate of correction of errors is constant, contradicting
the general belief that the later bugs are harder to correct. Subsequently we shall
show that the use of the constant removal rate hypothesis in postulating a model leads
to anomalous results. (See Ref. 1 for a discussion of the validity of the data in Figure 5.)

Another study on software debugging was conducted by Akiyama.⁴ His analysis is
shown in Figure 6. If we examine Figs. 3, 4 and 6 we find that the cumulative correction
curve has a high initial slope which gradually decreases as debugging proceeds. We will
make use of this fact in our hypothetical model.

Fig. 3. Cumulative error curve for supervisory system A given in Figure 1.

Fig. 4. Cumulative error curves for the systems shown in Figures 1 and 2.

Fig. 5(a). Working time expended in debugging as observed at Bell Labs (panel title:
Working Time to Diagnose and Correct, Compared).

Fig. 5(b). Computer time expended in debugging as observed at Bell Labs (panel title:
Computer Time to Diagnose and Correct, Compared).

Fig. 6. Cumulative curves on the occurrence of bugs for each module of sample
(horizontal axis: month).

Fig. 7. Cumulative errors debugged vs. months of debugging (vertical axis: normalized
cumulative errors debugged; horizontal axis: $\tau$, months of debugging). (a) Approaching
equilibrium, horizontal asymptote, no generation of new errors. (b) Approaching
equilibrium, generation rate of new errors equals error removal rate.

B.  The  Removal-Generation  Error  Model 

We  shall  consider  three  cases  of  software  debugging: 

(1)  Error  generation  rate  < Error  correction  rate 

(2)  Error  generation  rate  = Error  correction  rate 

(3)  Error  generation  rate  > Error  correction  rate. 

These  three  cases  are  shown  in  Figure  7. 

Intuitively we may feel that the bug generation rate is proportional to the number
of errors $n(\tau)$ remaining in the program. If we introduce the detection rate, $r_d(\tau)$,
and if we further assume that not all detected errors are corrected, we obtain Eqs. (1)
and (2) for the generation rate $r_g(\tau)$ and the correction rate $r_c(\tau)$:

$$r_g(\tau) = a\,n(\tau)\,r_d(\tau) \tag{1}$$

where $a$ is the proportionality constant, and

$$r_c(\tau) = b\,r_d(\tau) \tag{2}$$

where $b$ is the proportionality constant.

The rate of bug accumulation is given by Eq. (3):

$$\frac{dn(\tau)}{d\tau} = r_g(\tau) - r_c(\tau) \tag{3}$$

If the detection rate is constant, that is $r_d(\tau) = r_0$ where $r_0$ is a constant, then

$$\frac{dn(\tau)}{d\tau} = a\,n(\tau)\,r_0 - b\,r_0 \tag{4}$$

The solution of Eq. (4) is

$$n(\tau) = (n_0 - b/a)\,e^{a r_0 \tau} + b/a \tag{5}$$

where $n_0$ is the initial number of errors at $\tau = 0$. Equation (5) gives rise to three cases
which are given below.

(1) We discount the case where $n_0 = b/a$ because the probability of such an occurrence
is very low.

(2) If $n_0 > b/a$ the errors build up with resulting instability.

(3) If $n_0 < b/a$ then after prolonged debugging we find from Eq. (5) that $n(\tau)$
decreases and becomes negative, which is an absurd result.* In summary the
following two conclusions may be drawn:

(a) Either the assumptions are wrong, or

(b) The model is valid for only a very short period.

* The number of errors cannot be a negative quantity.

The result is depicted in Figure 8.

Fig. 8. The model developed with the assumption that the correction
rate is a constant, while the generation rate is proportional
to the number of remaining errors. (Panel (b): the case where $n_0 < b/a$.)
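The anomaly in case (3) can be checked numerically. The following is a minimal Python sketch of Eq. (5), assuming illustrative parameter values that do not come from the report; the function names `remaining_errors` and `zero_crossing_time` are hypothetical.

```python
import math

def remaining_errors(tau, n0, a, b, r0):
    """Eq. (5): n(tau) = (n0 - b/a) * exp(a * r0 * tau) + b/a."""
    return (n0 - b / a) * math.exp(a * r0 * tau) + b / a

def zero_crossing_time(n0, a, b, r0):
    """Time at which Eq. (5) predicts n = 0; None unless n0 < b/a.

    Setting Eq. (5) to zero gives exp(a*r0*tau) = (b/a) / (b/a - n0).
    """
    ratio = b / a
    if n0 >= ratio:
        return None  # the error count never reaches zero in this model
    return math.log(ratio / (ratio - n0)) / (a * r0)

# Illustrative (assumed) values: b/a = 100 errors.
a, b, r0 = 0.01, 1.0, 1.0
t_zero = zero_crossing_time(50, a, b, r0)
print("n reaches zero at tau =", t_zero)
print("n shortly afterwards  =", remaining_errors(t_zero + 10, 50, a, b, r0))
```

Past the zero crossing the formula returns negative error counts, which is the absurdity noted in case (3); with an initial count above b/a the same formula grows without bound, as in case (2).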


If we let $a\,r_d(\tau) = a_1$, then Eq. (1) reduces to

$$r_g(\tau) = a_1\,n(\tau) \tag{6}$$

Let us assume that the error correction process follows two different laws. These two
laws are given in Equations (7a) and (7b):

$$r_c(\tau) = b_1 \quad \text{for } n(\tau) > n_1 \tag{7a}$$

$$r_c(\tau) = b_2\,n(\tau) \quad \text{for } n(\tau) \le n_1 \tag{7b}$$

where $n_1$ is some critical number of bugs. If we substitute Eqs. (6) and (7a) in Eq. (3)
we get

$$\frac{dn}{d\tau} = a_1 n - b_1 \quad \text{for } n(\tau) > n_1 \tag{8}$$

Solving Eq. (8) we get Eq. (9):

$$n(\tau) = (n_0 - b_1/a_1)\,e^{a_1 \tau} + b_1/a_1 \tag{9}$$

Once again, by substituting Eqs. (6) and (7b) in Eq. (3) we get

$$\frac{dn}{d\tau} = a_1 n - b_2 n \quad \text{for } n(\tau) \le n_1 \tag{10}$$

A solution of Eq. (10) is

$$n(\tau) = n_1 \exp[(a_1 - b_2)(\tau - \tau_1)] \tag{11}$$

Equations (9) and (11) describe the error behavior throughout the debugging process.
We can differentiate among three cases depending upon the relative values of the
parameters. These are

Case 1. $n_0 > b_1/a_1$, resulting in instability
Case 2. $n_0 < b_1/a_1$ and $a_1 < b_2$, depicting efficient debugging
Case 3. $n_0 < b_1/a_1$ but $a_1 > b_2$, leading to oscillations.

For Case 3, because $a_1 > b_2$, the errors build up again, thereby forcing the correction
rate to obey Equation (7a). Subsequently the error behavior is described by
Equation (9). These three cases are shown qualitatively in Figure 9.

Fig. 9. Remaining errors plotted as a function of man-months of debugging.
(a) Unstable model: errors build up indiscriminately.
(b) Controlled model: debugging is efficient.
(c) Oscillatory model.

In Case 1 the debugging process results in an error buildup. Software managers
recognize such a problem, which unfortunately occurs on many projects. The cure is to
bring in more personnel or change the software team. Let us assume that this occurs
when $n = n^*$ at time $\tau = \tau'$. The new team may establish new values of $a_1$ and $b_1$ for $a$ and
$b$, such that $n^* < b_1/a_1$. The number of errors now starts decaying. The three cases
are shown in Figures 10, 11 and 12.
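The two-law model of Eqs. (6) to (10) can be explored with a short numerical sketch. The Python below is not the PL/1 program mentioned in the text; it is an assumed forward-Euler integration with hypothetical names (`correction_rate`, `simulate`, `classify`) and illustrative parameter values. The boundary cases (for example $n_0 = b_1/a_1$) are not treated in the text and are assigned arbitrarily here.

```python
def correction_rate(n, b1, b2, n1):
    """Eqs. (7a)/(7b): constant rate above the critical level n1, proportional at or below it."""
    return b1 if n > n1 else b2 * n

def simulate(n0, a1, b1, b2, n1, tau_end, dtau=0.01):
    """Integrate dn/dtau = a1*n - r_c(n) (Eq. 3 with Eq. 6) by forward Euler steps."""
    n = float(n0)
    history = [n]
    for _ in range(int(round(tau_end / dtau))):
        n += dtau * (a1 * n - correction_rate(n, b1, b2, n1))
        history.append(n)
    return history

def classify(n0, a1, b1, b2):
    """Cases 1 to 3 of the text; equality cases are an assumption of this sketch."""
    if n0 > b1 / a1:
        return "unstable"      # Case 1
    if a1 < b2:
        return "controlled"    # Case 2
    return "oscillatory"       # Case 3

# Illustrative (assumed) values: b1/a1 = 100, critical level n1 = 20.
for n0, b2 in [(120, 0.05), (80, 0.05), (80, 0.005)]:
    final = simulate(n0, 0.01, 1.0, b2, 20, tau_end=200)[-1]
    print(n0, b2, classify(n0, 0.01, 1.0, b2), round(final, 2))
```

Changing `a1` and `b1` partway through a run would be one way to mimic the change of software team described for Case 1.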


A computer  program  has  been  written  in  the  PL/1  language  to  plot  these  curves 
for  various  data.  The  program  plots  the  remaining  errors,  the  cumulative  corrected 
errors  and  the  total  errors. 

Further study is being conducted on the economics of debugging for Case 1. The
analysis will contrast the cost of debugging versus the cost of rewriting the program.

Fig. 11. Case 2. The controlled model. (Horizontal axis: debugging effort in man-months.)

Fig. 12. Case 3. Oscillatory model. Oscillations controlled after the first bump.

STATISTICAL THEORY OF PROGRAM TESTING
A.E.  Laemmel 


There are three methods for maximizing the reliability of computer programs:
(1) use a systematic procedure which makes it difficult for errors to occur during the
writing of the program, (2) prove that the program works correctly by some formal or
automatic process, and (3) test and debug the program thoroughly before passing it on
to the user. Most programmers will use some combination of these methods, and in
fact some procedures involve elements of more than one method. The present report
emphasizes testing, but first some remarks will be made about program writing and
proving.

(1)  Writing  Correct  Programs:  While  no  one  intentionally  inserts  errors  in  his 
program,  it  is  undoubtedly  true  that  many  people  would  produce  more  reliable 
programs  with  less  effort  if  they  were  taught  better  programming  techniques. 
However,  it  seems  obvious  that  the  average  programmer  should  test  his 
programs  even  if  he  exercises  the  maximum  of  care  and  uses  the  best  tech- 
niques. 

(2) Proving Program Correctness: There are several reasons why a formal procedure
for proving program correctness cannot be relied on to ensure absence
of errors in practical situations: (i) a uniform algorithm for proving the
correctness of an arbitrary program can be shown to be impossible, being
essentially equivalent to Turing's halting problem; (ii) even for solvable
subclasses of the correctness-proving problem, the usual method (some
improvement on Herbrand search) is so time consuming as to be impractical; (iii)
there is always a possibility of error in the proving program, or in applying
it to the program being tested.

(3)  Testing  Computer  Programs:  In  view  of  the  difficulties  of  validating  a com- 
puter  program  by  programming  techniques  or  formal  proof  methods,  it  is 
believed  that  some  amount  of  testing  will  always  be  necessary.  The  purpose 
of  this  report  is  to  describe  a model  which  shows  the  relationship  between 
errors  of  different  types  and  the  probability  that  they  will  cause  a program 
to  fail,  and  also  to  suggest  optimum  testing  methods  which  minimize  the 
probability  of  program  failure. 

A fairly  detailed  sketch  of  the  proposed  model  is  given  below. 

A.  Definitions 


Some basic aspects of the testing process apply equally well to a computer program
or to a physical device. For this reason the program or device being tested will be
referred to simply as the module. After the module has been constructed it is checked by a
tester and then employed by a user. The probability of error, $P_e$, which occurs during
use is given by

$$P_e = P_m P_u \tag{1}$$

where $P_m$ is the probability that the tester misses all of the residual bugs in the module,
and $P_u$ is the probability that the user then encounters one of the overlooked bugs. If
exhaustive testing is possible then $P_e = 0$, since it is assumed that if a bug is found it
is corrected and the whole process is started again. In most cases of interest exhaustive
testing is not possible or practical. If there are no bugs then all of the probabilities
are zero and this possibility appears as a special case in the analysis to follow.

B.  Elementary  Model 

A simple case might be the following: The module has N possible input values,
and each of these is equally likely to be chosen by the tester or by the user. Of these
input values, W cause improper functioning of the module, but neither the tester nor
the user knows which inputs cause errors or even how large W is. The tester chooses
t inputs at random without keeping a record of inputs previously tested, i.e., sampling
with replacement. Under these circumstances, $P_u = W/N$, $P_m = (1 - W/N)^t$ and

$$P_e = \frac{W}{N}\left(1 - \frac{W}{N}\right)^{t} \approx \frac{W}{N}\,e^{-Wt/N} \tag{2}$$

A plot of $P_e$ vs. W is displayed in Fig. 1. If the testing is to do any good, i.e., to
reduce $P_e$ to significantly less than W/N, then it is necessary that

$$t \gg \frac{N}{W} \tag{3}$$

In many applications it is found that satisfying the inequality of Eq. (3) requires a very
large number of tests t, and yet our intuitive feeling is that $P_e$ is acceptably small even
though the testing uses far fewer tests than indicated.

Fig. 1. Graphical representation of relationships among bugs and tests.

It will continue to be assumed that the tester passes no information to the user
about which inputs were tested. It will also be assumed that each test either succeeds
or fails; there is no additional information which would permit sequential sampling
methods. Sampling without replacement would slightly lower $P_m$ to

$$P_m = \begin{cases} \dfrac{(N-W)!\,(N-t)!}{(N-W-t)!\,N!} & t \le N-W \\[1ex] 0 & t > N-W \end{cases} \tag{4}$$

Note that this method requires that the tester keep a record of inputs already tested,
or that he avoids duplication by other means.
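Equations (2) and (4) are easy to compare numerically. The Python sketch below evaluates both forms of $P_m$ and the resulting $P_e$; the function names and the sample values of N, W and t are assumptions made for illustration only.

```python
from math import comb, exp

def p_miss_with_replacement(N, W, t):
    """P_m for t random tests drawn with replacement: (1 - W/N)**t."""
    return (1 - W / N) ** t

def p_miss_without_replacement(N, W, t):
    """Eq. (4): chance that t distinct inputs all avoid the W faulty ones."""
    if t > N - W:
        return 0.0
    # C(N-W, t) / C(N, t) equals (N-W)! (N-t)! / ((N-W-t)! N!)
    return comb(N - W, t) / comb(N, t)

def p_error(N, W, t, replacement=True):
    """Eq. (1): P_e = P_m * P_u with P_u = W/N."""
    p_miss = (p_miss_with_replacement if replacement
              else p_miss_without_replacement)(N, W, t)
    return p_miss * W / N

N, W = 1000, 5          # assumed example: 5 faulty inputs out of 1000
for t in (20, 200, 2000):
    exact = p_error(N, W, t)
    approx = (W / N) * exp(-W * t / N)   # right-hand side of Eq. (2)
    print(t, exact, approx)
```

With these assumed numbers N/W = 200, so only the largest t in the loop satisfies Eq. (3) and reduces $P_e$ well below W/N.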


The particular type of error to which $P_e$ pertains must be borne in mind to avoid
confusion with other possible definitions of error. If W = N all inputs to the module
cause malfunctions but $P_e = 0$ according to Equations (2) and (4). This is so because
the tester removes user bugs with each test, and continues until all tests are exhausted.
In this case the tester always rejects the module (provided only that t > 0) and so the
user cannot experience a malfunction. Note that $P_e$ is neither the probability {user has
a malfunction} nor the probability {user has a malfunction | tester accepts module}.
Rather, $P_e$ is the probability {user has a malfunction and tester accepts module}. If the
number of tests is fixed at t, and if testing with replacement is done, then the value
of W which maximizes $P_e$ is

$$W_0 = \frac{N}{t+1} \tag{5}$$

From Eq. (5) or Fig. 1, it can be seen that as $W \to 0$ then $P_e \to 0$ also. This is so
because for small W there is a small chance that the user will encounter a faulty input.
Similarly as $W \to N$ then also $P_e \to 0$ because there is little chance that the tester will
accept the module. Of course, the last case is undesirable for reasons other than the
value of $P_e$, i.e., the user has a small probability of receiving a released program.
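The worst-case value of W can be checked by brute force against the result of setting the derivative of Eq. (2) with respect to W to zero, which gives $W = N/(t+1)$. The Python sketch below does this for assumed values of N and t; `p_error` and `worst_w` are hypothetical names.

```python
def p_error(N, W, t):
    """Eq. (2), exact form: (W/N) * (1 - W/N)**t."""
    return (W / N) * (1 - W / N) ** t

def worst_w(N, t):
    """Integer W in 0..N that maximizes P_e, found by exhaustive search."""
    return max(range(N + 1), key=lambda W: p_error(N, W, t))

N, t = 100, 9                      # assumed example values
print("brute force :", worst_w(N, t))
print("N / (t + 1) :", N / (t + 1))
print("P_e at W = 0 and W = N:", p_error(N, 0, t), p_error(N, N, t))
```

Both ends of the range give $P_e = 0$, for the two different reasons discussed in the text.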

C.  Model  with  Unequal  Probabilities 

The  model  described  in  the  preceding  section  is  too  simple  to  apply  to  most  prac- 
tical testing  situations:  some  inputs  are  more  likely  to  fail  than  others,  and  the  proba- 
bility of  one  input  failing  may  not  be  independent  of  another  input  failing.  Often  a single 
bug  may  cause  many  inputs  to  fail.  The  tester  may  not  choose  the  inputs  to  be  tested 
randomly,  but  rather  in  such  a way  as  to  utilize  his  knowledge  of  the  a priori  failure 
probabilities.  The  user  may  not  be  free  to  choose  more  reliable  inputs;  in  fact,  he  may 
be  constrained  by  the  problem  to  use  less  reliable  inputs.  The  model  to  be  described 
here  includes  three  events,  the  last  two  being  independent  of  each  other  but  dependent 
on the first.

(1) A programmer constructs a module which has an error pattern $\alpha$ with
probability $P(\alpha)$. $\alpha$ might be a binary vector $(\alpha_1, \alpha_2, \ldots, \alpha_N)$ with $\alpha_i = 1$ meaning
input i fails and $\alpha_i = 0$ meaning input i functions correctly.

(2) A tester tries certain inputs to the module and accepts the module with
probability $R(\text{accept} \mid \alpha)$. The tester passes no information concerning which
inputs were tested to the user.

(3) A user selects one of the inputs and the module fails with probability $Q(\text{fail} \mid \alpha)$.

The error probability defined previously is now given by

$$P_e = \sum_{\alpha \in A} P(\alpha)\,Q(\text{fail} \mid \alpha)\,R(\text{accept} \mid \alpha) \tag{6}$$

where A is the set of all possible error patterns.

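To illustrate, if there are N input values then A consists of $2^N$ elements. Assume
$P(\alpha)$ is 0 for all $\alpha$ except

$$\alpha_W = (\underbrace{1, 1, \ldots, 1}_{W},\ \underbrace{0, 0, \ldots, 0}_{N-W})$$

and let the probability of the user selecting input i be $q_i$. Then

$$Q(\text{fail} \mid \alpha_W) = \sum_{i=1}^{W} q_i$$

Assume the tester selects his inputs randomly, choosing input i with probability $r_i$.*
Then

$$R(\text{accept} \mid \alpha_W) = \prod_{j=1}^{W} (1 - r_j)$$

$$P_e = \sum_{i=1}^{W} q_i \prod_{j=1}^{W} (1 - r_j) \tag{7}$$

If $q_i = 1/N$ and $r_i = t/N$ this reduces to

$$P_e = \frac{W}{N}\left(1 - \frac{t}{N}\right)^{W} \tag{8}$$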


Another, more useful, form for $P(\alpha)$ is obtained by assuming that the i-th input
malfunctions with probability $p_i$. Then

$$P(\alpha) = \prod_{i=1}^{N} p_i^{\alpha_i} (1 - p_i)^{1 - \alpha_i}$$

$$Q(f \mid \alpha) = \sum_{j=1}^{N} \alpha_j q_j$$

$$R(a \mid \alpha) = \prod_{i=1}^{N} (1 - r_i)^{\alpha_i}$$

The two products can be combined in evaluating $P_e$:

$$P_e = \sum_{\alpha_1} \sum_{\alpha_2} \cdots \sum_{\alpha_N}\ \sum_{j=1}^{N} \alpha_j q_j \prod_{i=1}^{N} F_i(\alpha_i)$$

where

$$F_i(\alpha_i) = p_i^{\alpha_i} (1 - p_i)^{1 - \alpha_i} (1 - r_i)^{\alpha_i}$$

This reduces to

$$P_e = \sum_{j=1}^{N} q_j\,p_j\,(1 - r_j) \prod_{i \ne j} (1 - p_i r_i) \tag{9}$$

* The formulas also apply if $r_i$ is given the interpretation "input i is tested and the
response is noted to be wrong by the tester." Specifically $r_i = 0$ might mean either
that input i was not tested, or that it was tested and an error was not detected.
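The reduction from the sum over all $2^N$ error patterns to the closed form of Eq. (9) can be verified numerically for a small N. The Python sketch below does so under the independence assumptions stated above; the function names and the sample vectors p, q and r are illustrative assumptions.

```python
from itertools import product
from math import prod

def p_error_enumerated(p, q, r):
    """Eq. (6) summed over every binary error pattern alpha."""
    n = len(p)
    total = 0.0
    for alpha in product((0, 1), repeat=n):
        prob_pattern = prod(p[i] if alpha[i] else 1.0 - p[i] for i in range(n))
        prob_fail = sum(alpha[j] * q[j] for j in range(n))
        prob_accept = prod(1.0 - r[i] for i in range(n) if alpha[i])
        total += prob_pattern * prob_fail * prob_accept
    return total

def p_error_closed_form(p, q, r):
    """Eq. (9): sum over j of q_j p_j (1 - r_j) times the product over i != j of (1 - p_i r_i)."""
    n = len(p)
    return sum(
        q[j] * p[j] * (1.0 - r[j])
        * prod(1.0 - p[i] * r[i] for i in range(n) if i != j)
        for j in range(n)
    )

p = [0.1, 0.2, 0.3]   # assumed failure probability of each input
q = [0.5, 0.3, 0.2]   # assumed user selection probabilities (sum to 1)
r = [0.4, 0.0, 1.0]   # assumed tester selection probabilities
print(p_error_enumerated(p, q, r), p_error_closed_form(p, q, r))
```

The enumeration grows as $2^N$ and is only practical for very small N, which is why the closed form is the useful one.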


If the tester selects t inputs deterministically, and if the inputs are permuted so that
these occur first, then

$$P_e = \prod_{j=1}^{t} (1 - p_j) \sum_{i=t+1}^{N} p_i q_i \tag{10}$$

An optimum testing strategy is obtained if the inputs are permuted so that this
expression is minimized. The first factor suggests testing inputs with the largest $p_i$, but the
second factor suggests testing inputs with the largest $p_i q_i$. Let the above expression
be abbreviated as $P_e = P_m P_u$ and note that $P_u$ is the probability of the user getting an
error on an untested input. Consider the effect of adding one more input, k, to the test.
Let

$$P_e' = P_m (1 - p_k)(P_u - p_k q_k)$$

so that, neglecting the term in $p_k^2$,

$$P_e' \approx P_e \left[1 - \frac{(q_k + P_u)\,p_k}{P_u}\right]$$

Thus the criterion is to select the input with the largest

$$(q_k + P_u)\,p_k \tag{11}$$

As can be seen, this provides a weighted compromise between selection on the basis of
$p_k$ and selection on the basis of $p_k q_k$.
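One way to use criterion (11) is as a greedy rule: repeatedly add the untested input with the largest score. The Python sketch below is an assumed illustration of that idea, not a procedure given in the report; because the criterion rests on a first-order approximation, the greedy choice is not claimed to be the true minimizer of Eq. (10). All names and sample values are hypothetical.

```python
from math import prod

def p_error_deterministic(p, q, tested):
    """Eq. (10) for an arbitrary set of tested input indices."""
    p_miss = prod(1.0 - p[j] for j in tested)
    p_user = sum(p[i] * q[i] for i in range(len(p)) if i not in tested)
    return p_miss * p_user

def greedy_tests(p, q, t):
    """Pick t inputs one at a time by the largest (q_k + P_u) * p_k, criterion (11)."""
    tested = []
    for _ in range(t):
        untested = [k for k in range(len(p)) if k not in tested]
        p_user = sum(p[i] * q[i] for i in untested)
        best = max(untested, key=lambda k: (q[k] + p_user) * p[k])
        tested.append(best)
    return tested

p = [0.5, 0.1, 0.3, 0.05]   # assumed failure probabilities
q = [0.1, 0.6, 0.2, 0.1]    # assumed user selection probabilities
chosen = greedy_tests(p, q, 2)
print(chosen, p_error_deterministic(p, q, chosen))
```

In this assumed example the rule passes over the input the user selects most often (index 1) because its failure probability is low, which is the compromise between the two factors described above.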
D.  Model  with  Statistical  Dependence 

Computer  programs  usually  fail  for  a whole  set  of  input  values  as  a result  of  a 
single  bug.  It  is  more  realistic  to  assume  that  failures  in  different  parts  of  the  program 
are  statistically  independent  rather  than  failures  for  different  input  values.  For  ex- 
ample, a single  oversight  might  cause  a square  root  program  to  fail  for  all  negative 
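numbers. Let

$$\alpha_{ji} = \begin{cases} 1 & \text{if the } i\text{-th bug causes failure for input } j \\ 0 & \text{if the } i\text{-th bug doesn't affect input } j \end{cases}$$

If $p_i$ is the probability of the i-th bug, if M is the number of possible bugs, and if
$\theta_i$ (i = 1, 2, ..., M) is the pattern of actual bugs, then

$$P_e = \sum_{\theta} P(\theta)\,Q(f \mid \theta)\,R(a \mid \theta) \tag{12}$$

This is analogous to the corresponding formula in $\alpha$ given above. Here,

$$P(\theta) = \prod_{i=1}^{M} p_i^{\theta_i} (1 - p_i)^{1 - \theta_i}$$

$$Q(f \mid \theta) = \sum_{j=1}^{N} q_j \left[ 1 - \prod_{i=1}^{M} (1 - \alpha_{ji}\theta_i) \right] \tag{13}$$

$$R(a \mid \theta) = \prod_{i=1}^{M} \prod_{\ell=1}^{N} (1 - r_\ell)^{\alpha_{\ell i}\theta_i}$$

Combining and rearranging gives

$$P_e = \sum_{\theta_1} \cdots \sum_{\theta_M}\ \sum_{j=1}^{N} q_j\,A_j(\theta_1, \ldots, \theta_M) \prod_{i=1}^{M} F_i(\theta_i)$$

where

$$F_i(\theta_i) = p_i^{\theta_i} (1 - p_i)^{1 - \theta_i} \prod_{\ell=1}^{N} (1 - r_\ell)^{\alpha_{\ell i}\theta_i}$$

and

$$A_j(\theta_1, \ldots, \theta_M) = 1 - \prod_{k=1}^{M} (1 - \alpha_{jk}\theta_k)$$

The summations over $\theta_i$ are for only two values (0, 1) and can be carried out to give

$$P_e = \prod_{i=1}^{M} \left[ 1 - p_i + p_i \prod_{\ell=1}^{N} (1 - r_\ell)^{\alpha_{\ell i}} \right] - \sum_{j=1}^{N} q_j \prod_{i=1}^{M} \left[ 1 - p_i + (1 - \alpha_{ji})\,p_i \prod_{\ell=1}^{N} (1 - r_\ell)^{\alpha_{\ell i}} \right]$$

If deterministic testing is used (r = 0, 1) over the first t inputs:

$$P_e = {\prod}' (1 - p_i) \sum_{j=t+1}^{N} q_j \left[ 1 - {\prod}'' (1 - \alpha_{ji} p_i) \right]$$

where $\prod'$ refers to a product over terms involving a value of i such that $\alpha_{ji} = 1$ for at
least one value of j in the range j = 1, 2, ..., t, and where $\prod''$ refers to the other values
of i. The bugs can be permuted so that $1 \le i \le T$ implies $\alpha_{ji} = 1$ for at least one value
of j from 1 to t and $T < i \le M$ implies $\alpha_{ji} = 0$ for all values of j from 1 to t. Then

$${\prod}' = \prod_{i=1}^{T} \quad \text{and} \quad {\prod}'' = \prod_{i=T+1}^{M}$$

The problem is then to minimize

$$P_e = \prod_{i=1}^{T} (1 - p_i) \sum_{j=t+1}^{N} q_j \left[ 1 - \prod_{i=T+1}^{M} (1 - \alpha_{ji} p_i) \right]$$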

A graphical interpretation of the above is portrayed in Fig. 2, and illustrates how
different inputs excite various bugs. For example, input 2 excites bugs numbered 1, 2,
3 and 4. Bug numbered M can only be discovered through the application of inputs
N-3, N-2, N-1 and N.

Fig. 2. Plot of $P_e$ vs. W from the model of Equation 2. (Legend: $P_e$ = error probability;
N = number of input values; W = input values causing misfunctioning; t = number of
inputs tested; $W_0$ = worst value of W.)

E. Illustrative Special Cases

Some simple special cases will illustrate the above formulas, and are useful in
getting a rough idea of the relation between number of tests and probability of error.
Suppose that each of the M bugs occurs with the same probability, P, and affects
the same number of inputs, b. Suppose further that the pattern of input errors is the
most difficult to detect with a given number of tests, t, i.e., that the error subsets
are "as disjoint as possible." If the tests are distributed most effectively over the
inputs, each test will detect Mb/N bugs. This will result in M - Mbt/N undetected bugs,
and T = Mbt/N (assuming bt < N).

(16) 

Usually  t « N and  the  middle  term  can  be  neglected.  For  the  testing  to  be  effective 
their  number  must  be  larger  than  a critical  value  given  by 


M 


N 


(1-^) 


1 - (1  -P) 


M(1  - 


which  were  given  a 
smalles  value 


(17) 


The  latter  is  independent  of  p because  the  testing  is  exhaustive. 

F.  Alternative  Definitions  of  Error 

The definition of error which was used above may not be suitable for all purposes,
but related probabilities can easily be calculated from the equations given. Define two
events as follows:

$E_u$ is the event that a user employs the module once and encounters a failure.

$E_m$ is the event that a tester operates the module t times and does not encounter
a failure, i.e., the tester accepts the module. The subscript m is mnemonic for
"missed," since bugs are always assumed to be present with some probability, and
there is therefore some probability that the tester has missed all bugs, i.e., incorrectly
accepted the module.

Three different quantities which might be interpreted as the probability of user
error are tabulated below for the various models which have been analyzed. The
effectiveness of the testing process can be judged by comparing the first with the last two.
Note that the last, $P\{E_u \mid E_m\}$, is never smaller than the second, $P\{E_u E_m\} = P_e$. This
fact might lead to a choice of the conditional probability, $P\{E_u \mid E_m\}$, in some cases
where $P_e$ is small due to tester rejection and in spite of a large probability of module
bugs.

G. Conclusions and Comments

It is believed that defining what is meant by the probability of a program error,
and presenting a model which permits its exact calculation (Eq. 15), will provide the
nucleus around which a theory of software reliability can be built. The purpose is not
merely to get a formula into which numbers can be plugged to give a probability of
error; this does nothing to reduce errors. Rather it is anticipated that by classifying
different types of bugs, errors and test results, and by showing how they interact,
programming systems can be improved and optimum testing procedures can be found.

Table  I summarizes  the  models  discussed. 

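TABLE I. Summary of the statistical test models. Three types of error probability are
listed for each model: the probability of user failure, $\Pr\{E_u\}$; the probability of user
failure and acceptance, $P_e = \Pr\{E_u E_m\}$; and the probability of user failure given
acceptance, $\Pr\{E_u \mid E_m\}$.

Equal failure probabilities ($p_i = W/N$)
    $\Pr\{E_u\} = \dfrac{W}{N}$
    $P_e = \dfrac{W}{N}\left(1 - \dfrac{W}{N}\right)^{t}$
    $\Pr\{E_u \mid E_m\} = \dfrac{W}{N}$

Unequal failure probabilities ($M = N$, $\alpha_{ji} = \delta_{ji}$)
    $\Pr\{E_u\} = \sum_{i=1}^{N} q_i p_i$
    $P_e = \sum_{i=t+1}^{N} q_i p_i \prod_{j=1}^{t} (1 - p_j)$
    $\Pr\{E_u \mid E_m\} = \sum_{i=t+1}^{N} q_i p_i$ (sum over untested inputs)

With statistical dependence
    $\Pr\{E_u\} = \sum_{j=1}^{N} q_j \left[ 1 - \prod_{i=1}^{M} (1 - \alpha_{ji} p_i) \right]$
    $P_e$: see Eq. (15)
    $\Pr\{E_u \mid E_m\} = \sum_{j=t+1}^{N} q_j \left[ 1 - \prod_{i=T+1}^{M} (1 - \alpha_{ji} p_i) \right]$

Special case of above
    $\Pr\{E_u\} = 1 - (1 - P)^{\ldots}$
    $P_e$: see Eq. (16)
    $\Pr\{E_u \mid E_m\} = 1 - (1 - P)^{\ldots}$

STATISTICAL THEORY OF PROGRAM TESTING AND PROVING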
A.E.  Laemmel 

The  purpose  of  this  work  is  to  extend  results  on  program  testing  which  were  given 
in  Ref.  1 and  to  discuss  the  efficiency  and  statistics  of  combined  testing  and  proving 
programs.  A program,  call  it  an  analyzing  program,  which  is  designed  to  test  and/or 
prove  another  (subject)  program  is  prone  to  many  practical  and  theoretical  difficulties. 
In  fact,  if  one  ponders  the  process  of  testing  and  proving  the  analyzing  program  itself, 
one  quickly  arrives  at  the  conclusion  that  a perfectly  general  version  cannot  exist. 
Nevertheless,  computer  programmers  are  constantly  being  given  subject  programs 
about  which  they  must  say  something,  and  subject  programs  are  constantly  given  to 
users without being completely tested or proved correct. Note that we must, for
practical reasons, leave open the possibility that part of the analyzing program is carried
out interactively by the human operator.

More specifically, it will be shown how better analyzing programs can be written
by:

(1)  Using  several  criteria  for  optimum  search  strategies  to  minimize  the  proba- 
bility of  indecisive  outcomes 

(2)  Imitating  human  reasoning  processes,  especially  in  seeking  proof 

(3)  Using  previously  reported  statistical  results  to  provide  more  realistic  proba- 
bilities for  (1)  above. 

Before treating testing and proving programs, a class of "decision programs" will
be discussed. These differ from most computer programs in that the answer is not
"105.3" or "yes" -or- "no," but rather "yes" -or- "no" or "I don't know." The latter
case of indecision is not a result of any lack of programming ability; rather it is
intrinsic in the problem.

A.  Decision  Programs 

For  the  purpose  of  the  present  discussion,  a decision  program  will  be  defined  as 
one  which  examines  a given  object  or  set  of  objects  and  which  outputs  one  of  several 
messages.  This  definition  is  purposely  too  general  (it  includes  almost  every  program), 
but  it  will  be  narrowed  by  example  and  by  definition  below.  It  does  not  seem  desirable 
to exclude programs such as one for calculating tan(θ); these can meet the definition as
degenerate cases. An example which illustrates the idea of a decision program is one
which examines an initial string such as 000101 and which outputs the result of applying
Post's tag rules (0..S → S00, 1..S → S1101). Possible outputs might be:

(1) THE STRINGS VANISH AFTER n ITERATIONS

(2) THE STRINGS ULTIMATELY REPEAT WITH A PERIOD p

(3) ITERATION LIMIT t EXCEEDED

(4) MEMORY ALLOCATION m EXCEEDED.

A SNOBOL program which does this is given in Section C. Note that a characteristic
feature of most decision programs is the presence of outcomes such as (3) and (4). They
say essentially that no decision was reached within the allocated bounds.
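A decision program of this kind can be sketched in a few lines. The Python below is not the SNOBOL program of Section C; it is an assumed illustration in which each rule deletes the first three symbols and appends 00 or 1101 according to the first symbol, and in which a string shorter than three symbols is treated as having vanished. The function name, the result labels and the default limits are hypothetical.

```python
DELETE_COUNT = 3                      # symbols removed from the front each step
APPEND = {"0": "00", "1": "1101"}     # suffix chosen by the first symbol

def run_tag_system(initial, iteration_limit=10_000, memory_limit=10_000):
    """Apply the tag rules to a string of 0s and 1s; return (outcome, value)."""
    seen = {}       # string -> iteration at which it first appeared
    s = initial
    n = 0
    while True:
        if len(s) < DELETE_COUNT:
            return ("VANISH", n)                          # outcome (1)
        if s in seen:
            return ("REPEAT", n - seen[s])                # outcome (2), value is the period
        if len(s) > memory_limit:
            return ("MEMORY_EXCEEDED", memory_limit)      # outcome (4)
        if n >= iteration_limit:
            return ("ITERATION_LIMIT_EXCEEDED", iteration_limit)  # outcome (3)
        seen[s] = n
        s = s[DELETE_COUNT:] + APPEND[s[0]]
        n += 1

print(run_tag_system("000"))
print(run_tag_system("10100110100"))
print(run_tag_system("100100", iteration_limit=3))
```

The last two outcomes carry no information about the string itself; they only report that the allocated bounds were reached, which is the point made in the text.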

Another example of a decision program is a testing program which examines
another subject program. Possible outputs here might be:

(1)  SUBJECT  PROGRAM  FAILS  ON  INPUT  x 

(2)  SUBJECT  PROGRAM  OPERATED  CORRECTLY  ON  n DIFFERENT  INPUTS 

Note  here  that,  except  for  the  very  simplest  subject  programs,  to  expect  an  output  such 
as 

(2a)  SUBJECT  PROGRAM  WILL  ALWAYS  OPERATE  CORRECTLY 

is  both  theoretically  and  practically  unreasonable.  One  can  make  outcomes  (3)  and  (4) 
in  the  first  example  very  unlikely  by  setting  t and  m large  enough,  but  there  is  no  way 
to  calculate  values  of  t and  m ahead  of  time  which  will  permit  (3)  and  (4)  to  be  replaced 
by 

(3a)  THE  STRINGS  GROW  WITHOUT  BOUND 

We are thus faced with the fact that in many cases of interest, some of the possible
outcomes of a decision program have the practical interpretation "NO DECISION COULD
BE REACHED."

Proof/counter  example  model.  A special  case  of  a decision  program  occurs  very 
frequently  in  various  fields  of  mathematics  and,  in  particular,  it  describes  the  situation 
in  computer  program  testing.  Abstractly,  a formula  of  the  form 

(x)  T(x) 

is  given,  meaning  that  for  every  x in  the  universe  of  discourse  the  predicate  T is  true 
for  the  subject  x. 

Two examples are:

Example 1 (Goldbach's conjecture): (x) means "all even integers greater than 2";
T(x) means "x is the sum of two primes."

Example 2 (program testing): (x) means "all non-negative integers"; T(x) means
"the program correctly calculates x!"

The abstract model will now be enlarged to provide a convenient formalism. We
are given (x) T(x) and are to say that it is either correct or incorrect, i.e., one of the
following:

$$M_1 M_2 \ldots M_n\ L_1 L_2 \ldots L_m \vdash (x)\,T(x) \tag{1}$$

$$(\exists y)\,\overline{T}(y) \tag{2}$$

The first is a symbolism for the fact that a proof has been found for (x) T(x) starting
with algebraic axioms $M_1 M_2 \ldots M_n$ and logical axioms $L_1 L_2 \ldots L_m$, using accepted
rules of inference. The second is simply a statement that a subject y has been found
for which T(y) is false, i.e., a counter example. Note that the axioms are called
"algebraic" merely to distinguish them from the logical axioms; they pertain to
whatever field the Theorem T is in, e.g., number theory, analysis, program translation,
etc.

In mathematics one is often faced with alternately trying to prove and to disprove a
suspected theorem, i.e., alternately working on the possibilities of outcomes (1) and (2) above.
Of course, there is also a third possibility that neither (1) nor (2) has been established
within the allocated time and resource limits. The best that can be done is to try to
minimize the probability of this third "NO DECISION" outcome, to minimize the
expected cost to a definite decision, or to optimize with respect to some similar parameter.
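The counter-example half of this model, outcome (2) together with the "NO DECISION" outcome, can be sketched as a bounded search. The Python below is an assumed illustration; it makes no attempt at outcome (1), and the names `search_counterexample`, `goldbach_holds` and the look limit are hypothetical. The universe is assumed to supply at least as many subjects as the limit allows.

```python
from itertools import count, islice

def is_prime(n):
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def goldbach_holds(x):
    """T(x) of Example 1: x is the sum of two primes."""
    return any(is_prime(a) and is_prime(x - a) for a in range(2, x // 2 + 1))

def search_counterexample(predicate, universe, look_limit):
    """Examine up to look_limit subjects; report a counter example or no decision."""
    looks = 0
    for x in islice(universe, look_limit):
        looks += 1
        if not predicate(x):
            return ("COUNTEREXAMPLE", x)      # outcome (2)
    return ("NO DECISION", looks)             # neither (1) nor (2) established

# Example 1: even integers 4, 6, 8, ... within an assumed limit of 100 looks.
print(search_counterexample(goldbach_holds, count(4, 2), 100))
# A false claim for contrast: "every odd integer from 3 upward is prime".
print(search_counterexample(is_prime, count(3, 2), 100))
```

However large the look limit, a run with no counter example still ends in "NO DECISION" rather than in a proof.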

Black's strategy. The simplest version of the situation described above might be
rephrased in traditional terms as follows: a single red ball is in one of two boxes, each
of which contains a very large number of white balls. Let

$p_i$ = a priori probability that red ball is in box i

$m_i$ = probability that if red ball is in box i a single look will miss it

$c_i$ = cost of a single look in box i

Black has given a simple way to find the minimum expected cost strategy for searching
for the red ball. Arrange the numbers

$$\frac{p_i\,m_i^{\,n-1}\,(1 - m_i)}{c_i}, \qquad i = 1, 2; \quad n = 1, 2, 3, \ldots$$

in decreasing order. If the k-th number in this arrangement is one with i = 1 then the
k-th look should be in box 1, otherwise in the other box.

In  order  to  get  a feeling  for  this  strategy  in  the  present  application,  let  box  1 
represent  possible  proofs  that  a certain  computer  program  works  correctly  and  box  2 
represent  various  sets  of  input  data. 

Choose  parameters: 
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I 


= . 9 <- (program  probably  OK)  ->  p^  = . I | 

i 

m,  = .999-1  computer  search  for  m_  = .2  ^ one  3 

1 1 ' 

Cj  = 100  a proof  time-consuming  c^  = 5 program  test  is  eas-y  | 

and  not  likely  to  succeed  j 

The two series of numbers are then

    Box 1                                        Box 2
    p_1 (1 - m_1) / c_1        = .000009         p_2 (1 - m_2) / c_2        = .016
    p_1 (1 - m_1) m_1 / c_1    = .00000899       p_2 (1 - m_2) m_2 / c_2    = .0032
    p_1 (1 - m_1) m_1^2 / c_1  = .00000898       p_2 (1 - m_2) m_2^2 / c_2  = .00064
                                 .00000897                                    .000128
                                 .00000896                                    .0000256
                                 .00000895                                    .00000512

According to this, looks 1, 2, 3, 4 and 5 should be in box 2, then looks 6, 7, 8, ... (for several
hundred looks) should be in box 1. This strategy simply says to look in the box which
has the highest ratio of a posteriori probability to cost at any stage. It is intuitively
satisfying that the first looks were for counterexamples, and (perhaps) that the sixth
look was for a proof. It is not too satisfying that these looks are to be followed by $a$
attempts at a proof, where $a$ is given by

$.000009 \, m_1^{a} = .00000512$

$.57 = (1 - .001)^{a} \approx 1 - .001\,a$

If the first series above decreased more rapidly, and the second series decreased
more slowly, the strategy would go back and forth between proof and counterexample
more frequently. The main difficulty is that the successive trials, either for proofs or for
counterexamples, are not statistically independent, as is required in Black's model.

Each attempt at a proof can build on the last attempt because any partial result
obtained in the previous attempt can be used in the next. This means that $m_1$ should
decrease with n instead of remaining constant. On the other hand, if a computer program
is tested with 10 random inputs, and if no failure occurs for the first 9, then the
10th trial certainly gives less information than the 1st trial. This means that $m_2$ should
increase with n, i.e., that errors are harder to find with a single test later on in the
testing sequence. Such a result can be derived from formulas given previously, since
these represent a way to describe statistical dependence among test outcomes.
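
The ordering rule of Black's strategy can be sketched in a few lines of Python. This is only an illustration under the independence assumption of the model; the function names `look_value` and `schedule` are hypothetical and do not come from the report. Because each box's sequence is itself decreasing, merging the sequences in decreasing order amounts to always taking the box whose next number is largest.

```python
def look_value(p, m, c, n):
    """Index for the n-th look (n = 1, 2, ...) in a box: p * m**(n-1) * (1 - m) / c."""
    return p * m ** (n - 1) * (1 - m) / c


def schedule(boxes, num_looks):
    """Return the box number (1-based) to search on each of the first num_looks looks.

    boxes is a list of (p, m, c) triples, one per box.
    """
    next_n = [1] * len(boxes)          # index of the next unused look in each box
    order = []
    for _ in range(num_looks):
        values = [look_value(p, m, c, next_n[i]) for i, (p, m, c) in enumerate(boxes)]
        best = max(range(len(boxes)), key=lambda i: values[i])
        order.append(best + 1)
        next_n[best] += 1
    return order


# Parameters of the example: box 1 = proof attempts, box 2 = test inputs.
boxes = [(0.9, 0.999, 100.0), (0.1, 0.2, 5.0)]
order = schedule(boxes, 600)

first_proof = order.index(1) + 1       # number of the first look in box 1
run = 0                                # consecutive looks in box 1 after that
for box in order[first_proof - 1:]:
    if box != 1:
        break
    run += 1
print(order[:6], first_proof, run)
```

With the parameters of the example the sketch places looks 1 through 5 in box 2, the sixth look in box 1, and then stays in box 1 for several hundred looks before returning to box 2.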

B.  Automatic  Search  for  Program  Proofs 

In the process described above it is easy to see how a computer can be programmed
to look for counterexamples, but how is a computer to look for a proof of a formula
such as


(x)  T(x) 


For an infinite universe, and even for large finite universes, the falsity can be established
by a single value $x_i$ for which $T(x_i)$ is false, but no number of $x_i$ for which $T(x_i)$ is true
can establish truth for all values. At least three ways have been explored to circumvent
this difficulty:

(1) In the Herbrand search method, a single instance of a more complicated formula
serves to prove T(x) for all x. The difficulty here is that this single instance is very
hard to find, much more so than a typical instance which causes T(x) to be false. Note
that a basic cause of this difficulty is that the search is for a proof of the program and
the algorithm it is based on.

(2) A related search procedure has been studied which converges to (x) T(x) by
"successive approximations," i.e., a weaker theorem is first established, and then it
is strengthened at each step.

In certain cases a large number of true instances might establish a certain statistical
credibility for (x) T(x). While this "proof by statistics" is never accepted by
mathematicians, it is almost universally accepted by programmers.

(3) A good way to proceed in practice might be to use the computer for the
counterexample searches, and the computer operator for proof attempts. This interactive
approach would use the computer only for what it does most efficiently: tediously
checking out many test cases; and the human for what he does best: inventively selecting the
steps in a demonstration. Certain parts of the proof procedures which involve searches
might also be carried out by the computer.

Four sections follow. Section 1 explains the inductive assertion method of
program proof. This leads to a theorem to be proved, usually by Herbrand search
(Section 2). There the connection between Herbrand search proofs and mathematics
"textbook proofs" is shown in the hope of improving the efficiency of the former. In Section 3
an alternate method for generating proofs by computer is sketched. Finally, in Section 4
a method for optimizing the required decision programs is given.
1.  Inductive Assertion


This method of proving program correctness, as distinguished from merely testing
the program with certain input data, is usually called Floyd's Method (Ref. 2). A brief
history of the method, an example, and a discussion of its basis in mathematical induction
will be found on pp. 11-21 of Knuth's first volume on programming. Good discussions
and reviews can be found in Elspas et al. and London. Consider the following
very simple example:

[Flowchart: factorial program annotated with assertions, including 1 ≤ k ≤ n and y = (k-1)!.]

Note: Add n > 0 to all sets of assertions and also, strictly, that all variables
are integers.


This  example  computes  the  factorial,  but  the  method  applies  to  any  program  which  can 
be  reduced  to  a flow  chart  with  the  following  two  types  of  boxes 


Here each of S, U, R, W, B, C and D is a predicate (or usually a set of predicates) which
makes assertions about the variables x, y. These variables take on various values during
program execution, and for any given set of values each of the predicates S, ..., D is true (T) or
false (F). In particular, C true or false determines the exit from the conditional branch
box. The symbol $\Phi(x,y)$ represents the action of executable statements in the program
on the variables, but it too can be regarded as a predicate which states that the stated
action has been carried out.

It is required that the assertions at the output of a square box be implied by the
assertions on one arrow or the other entering the box, together with the operation performed
within the box:

$[S(x,y) + U(x,y)] \, \Phi(x,y) \supset R(x,y)$

One should not be confused by the "or"; this is equivalent to

$S(x,y) \, \Phi(x,y) \supset R(x,y)$

and

$U(x,y) \, \Phi(x,y) \supset R(x,y)$

For the decision box we must have both

$W(x,y) \, C(x,y) \supset D(x,y)$

and

$W(x,y) \, \overline{C}(x,y) \supset B(x,y)$

In this description x represents input data and y program variables. Note that do-nothing
square boxes can always be added to avoid more than two inputs to a square box, or to avoid
more than one input to a decision box.

The  algorithm  is  proved  if  the  following  four  conditions  are  met: 

(1)  A set  of  assertions  satisfying  the  above  conditions  is  formed 

(2)  The  assertions  on  the  start  arrow  only  put  known  conditions  on  the  input 
variables 

(3)  The  assertions  on  the  halt  arrow  state  that  the  desired  answer  has  been 
obtained 

(4)  The  algorithm  terminates,  i.e.  , it  actually  reaches  the  halt  arrow. 

The last condition is essential, yet as was shown by Turing there is no algorithm
to test if (4) is true or false. However, in many practical problems halting can be
shown, e.g., in the factorial program above k starts at 1, increases by 1 each time
around the loop, and a halt occurs when k > n > 0. Also, finding a consistent set of
assertions might require considerable skill for a complicated algorithm. The theorem
to be proved is, in this sense, not usually known at the start, and the theorem proving
method of Section 3 below can help here.
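
As a rough illustration of the inductive assertion method, the factorial example can be written with the assertions checked at run time. This is a sketch only: the flowchart itself is not reproduced here, so the exact placement of the assertions, the function name `factorial_with_assertions`, and the use of `math.factorial` as the reference for the predicate y = (k-1)! are all assumptions. Executing the assertions tests them on particular inputs; the method proper requires proving each implication for all values.

```python
from math import factorial as reference_factorial


def factorial_with_assertions(n):
    # Start arrow: only known conditions on the input (n is an integer, n > 0).
    assert isinstance(n, int) and n > 0
    k, y = 1, 1
    while True:
        # Decision box: C is "k > n"; the true exit leads to the halt arrow.
        if k > n:
            break
        # Assertion on the false exit of the decision box.
        assert 1 <= k <= n and y == reference_factorial(k - 1)
        # Square box: the action on the program variables.
        y = y * k
        k = k + 1
        # Assertion at the output of the square box (same form, with the new k).
        assert y == reference_factorial(k - 1)
    # Halt arrow: the desired answer has been obtained.
    assert y == reference_factorial(n)
    return y


print(factorial_with_assertions(5))
```

Termination follows the argument in the text: k starts at 1, increases by 1 on each pass, and the loop exits as soon as k > n.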


2.  Herbrand Search

It is desired to determine if a suspected theorem T can be inferred from a set
of axioms $A_1, A_2, \ldots, A_n$, where all statements are quantified formulas of the predicate
calculus. Gödel's completeness theorem states essentially that "$A_1, A_2, \ldots, A_n \vdash T$ if
and only if there is no model for $A_1, A_2, \ldots, A_n$ and $\bar{T}$ (the negation of T)." Trying to
establish that there is no model for a set of statements may seem like an endless task, but
Herbrand's Theorem at least offers a systematic search procedure: "$A_1, A_2, \ldots, A_n \vdash T$ if and
only if a certain Boolean expression $Q_1 Q_2 \cdots Q_r$ vanishes for some finite r." A brief
sketch of how the $Q_i$ are to be calculated from the $A_j$ and T will now be given.

Briefly, the idea is to replace all existential quantifiers by functions, to express
$A_1, A_2, \ldots, A_n, \bar{T}$ in product-of-sums Boolean form, and to generate the $Q_i$ from these
factors by substituting values from the Herbrand universe for the remaining variables.
The Herbrand universe consists of constants, functions of constants, functions of functions
of constants, etc., the functions being those which were used to replace the
existential quantifiers. The general procedure is described by Davis. Consider the
following example:

axioms:

$A_1 = (x) \, \bar{R}_{xx}$

$A_2 = (xyz) \, (R_{xy} R_{yz} \supset R_{xz})$

theorem:

$T = (xy) \, \overline{R_{xy} R_{yx}}$

The processed statements are

$A_1 = \bar{R}_{ww}$

$A_2 = \bar{R}_{xy} + \bar{R}_{yz} + R_{xz}$

$\bar{T} = (Exy) \, R_{xy} R_{yx} = R_{ab} R_{ba}$

The clauses are $\bar{R}_{ww}$, $\bar{R}_{xy} + \bar{R}_{yz} + R_{xz}$, $R_{ab}$, $R_{ba}$, and the Herbrand universe is simply
H = {a, b}. The $Q_i$ are formed by substituting elements of H for the variables w, x, y
and z in the clauses. Here there are only 20 different $Q_i$ possible.

$Q_1 Q_2 \cdots Q_{20} = R_{ab} R_{ba} \bar{R}_{aa} \bar{R}_{bb} (\bar{R}_{aa} + \bar{R}_{aa} + R_{aa})(\bar{R}_{aa} + \bar{R}_{ab} + R_{ab}) \cdots (\bar{R}_{ab} + \bar{R}_{ba} + R_{aa}) \cdots (\bar{R}_{bb} + \bar{R}_{bb} + R_{bb})$    (1)

Here the factor $\bar{R}_{aa}$ comes from w = a and $\bar{R}_{bb}$ from w = b; the first two bracketed factors
come from x = y = z = a and from x = y = a, z = b; the third from x = z = a, y = b; and the last
from x = y = z = b.

This product vanishes because a subset of factors vanishes:

$R_{ab} R_{ba} \bar{R}_{aa} (\bar{R}_{ab} + \bar{R}_{ba} + R_{aa}) = 0$

Thus, it is established that T is a theorem which can be inferred from axioms $A_1$ and
$A_2$.

The above example can be given many interpretations: among real numbers we
cannot have a < b and b < a; set A cannot properly contain set B if set B properly contains
A; etc. Formally, the two axioms define a transitive irreflexive ordering relation.

The usual way to prove the theorem might be

(1) Assume, contrary to the Theorem T, that we can simultaneously have $R_{ab}$
and $R_{ba}$.

(2) A special case of axiom $A_2$ then says that $R_{ab} R_{ba} \rightarrow R_{aa}$.

(3) But by axiom $A_1$ it is always false that $R_{aa}$.

(4) Therefore (1) is false since it leads to a contradiction, and the Theorem T is
true.

The connection between these four steps and the vanishing of Eq. (1) can be seen
by rephrasing the steps in symbolic form:

(1) $R_{ab} R_{ba}$    (Assumption)

(2) $R_{ab} R_{ba} (\bar{R}_{ab} + \bar{R}_{ba} + R_{aa}) = R_{ab} R_{ba} R_{aa}$    (Using axiom 2)

(3) $R_{aa} \bar{R}_{aa} = 0$    (Then using axiom 1)

(4) $\therefore R_{ab} R_{ba} = 0$    (Since other factor ≠ 0)

Thus it is seen that Herbrand search can be regarded as a systematic, though inefficient,
way to combine the axioms with various substitutions to try to establish the validity of
a suspected theorem.
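
The search in this small example can be mimicked by brute force. The sketch below is an illustration, not the procedure of the report: it forms the ground instances of the clauses over H = {a, b} and checks by truth table that no assignment of the four atoms R_aa, R_ab, R_ba, R_bb makes every factor true, which is what the vanishing of the product means. The data representation and the names `ground_clauses` and `satisfiable` are assumptions made for the illustration.

```python
from itertools import product

H = ["a", "b"]  # Herbrand universe of the example


def ground_clauses():
    """Ground instances of the clauses; a literal is (sign, (x, y)) standing for R_xy or its negation."""
    clauses = [[(True, ("a", "b"))], [(True, ("b", "a"))]]     # negated theorem: R_ab, R_ba
    clauses += [[(False, (w, w))] for w in H]                   # axiom 1: not R_ww
    for x, y, z in product(H, repeat=3):                        # axiom 2: not R_xy + not R_yz + R_xz
        clauses.append([(False, (x, y)), (False, (y, z)), (True, (x, z))])
    return clauses


def satisfiable(clauses):
    """True if some truth assignment makes every clause (sum) true, i.e. the product does not vanish."""
    atoms = sorted({atom for clause in clauses for _, atom in clause})
    for values in product([False, True], repeat=len(atoms)):
        model = dict(zip(atoms, values))
        if all(any(model[atom] == sign for sign, atom in clause) for clause in clauses):
            return True
    return False


all_factors = ground_clauses()
# The subset of factors singled out in the text.
core = [
    [(True, ("a", "b"))],
    [(True, ("b", "a"))],
    [(False, ("a", "a"))],
    [(False, ("a", "b")), (False, ("b", "a")), (True, ("a", "a"))],
]
print(satisfiable(all_factors), satisfiable(core))
```

Both calls report that no satisfying assignment exists, whereas the axiom instances alone (without the negated theorem) remain satisfiable, for example by making every R false.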

The above example was unusually simple in that the Herbrand universe was finite.
Suppose an axiom or $\bar{T}$ is of the form

$(x)(Ey)(z) \, R(x, y, z)$

The reduction steps give

$R(x, f_x, z)$

and the Herbrand universe is $\{a, f_a, f_{f_a}, \ldots\}$. Substituting these elements in all possible
ways for x and z gives

$R(a, f_a, a)$

$R(a, f_a, f_a)$

$R(f_a, f_{f_a}, a)$

$R(f_a, f_{f_a}, f_a)$

$R(a, f_a, f_{f_a})$

Unless guided by human insight, such aimless substitution results in a $Q_1 Q_2 \cdots Q_r = 0$
only after impossibly long times.

3.  Alternate  Method  for  Theorem  Search 

This method differs from Herbrand search in that it finds a theorem instead of
merely proving a given theorem. Also, the process is certain to give some information
at various steps before completion, this information being in the form of weaker forms
of the final theorem. The process is patterned after what is believed to be the process
gone through by a human in a similar search, and in this regard is hoped to be more
efficient than the blind Herbrand search. There are difficult steps which might even
require small Herbrand proofs, but this might best be done by a human interacting with
the computer.

The basic idea is to have, at each step, a necessary (N) and sufficient (S) condition
for some well-formed statement (F). In terms of sets

$S \subseteq F \subseteq N$

Elements not covered are in $\bar{S} \cap N$, this representing cases which meet the necessary
conditions but which do not meet the sufficient conditions. The procedure is to successively
reduce this region of indecision by either enlarging S or contracting N. Consider
the Venn diagram (Figure 6).

[Figure 6: Venn diagram of F with necessary conditions N_1, N_2 and sufficient conditions S_1, S_2.]

If $N_1$ and $N_2$ are necessary and sufficient conditions for F, then the shaded area inside
both $N_1$ and $N_2$ but outside F will be empty, and if $S_1 \vee S_2$ is a necessary and sufficient
condition for F, then the shaded area inside F but outside $S_1 \vee S_2$ will be empty. If the
total shaded area is empty then either combination of conditions is a necessary and
sufficient set, both combinations being equivalent.


The above suggests several ways to search for a theorem of the form under discussion.
Suppose conditions $N_1, N_2, \ldots,$ and $N_r$ are shown to be necessary for property
F. Search for a member of ($N_1$ and $N_2$, ..., and $N_r$) and $\bar{F}$, calling such an element,
if found, x. The next step is to generalize element x to a set $\bar{N}_{r+1}(x)$, i.e., the set of
elements failing a new condition $N_{r+1}$. Next test for a member of F and $\bar{N}_{r+1}$. If no
member exists, then enlarge the set of necessary conditions to $N_1$ and $N_2$ and ... $N_r$
and $N_{r+1}$, where $N_{r+1} = N_{r+1}(x)$, and repeat the whole procedure. Finally, if no further x
can be found, the set of necessary conditions is sufficient. The sufficient set $S_1 \vee S_2
\vee \ldots \vee S_r$ is expanded in an analogous way.

More details on this theorem search method will be given in a forthcoming report.
Examples will be given of how the process has been applied to finite state machines and
variable length codes, and how it might be applied to helping to prove computer programs.


4.  Optimizing  Decision  Programs 

In searching for theorems by the method outlined above, it is necessary in several
places to quickly search for an element x satisfying n conditions $D_1, D_2, \ldots,$ and $D_n$.
Suppose condition $D_i$ is checked by a subroutine at cost $c_i$ (say, in time) and that
condition $D_i$ has a probability $p_i$ of being true. What is the best order in which to make
the tests so as to minimize their expected total cost? If all $p_i$ are equal, obviously the
least costly should be done first, and if all costs are equal the condition with the smallest
probability of success should be done first. If both $p_i$ and $c_i$ vary, then the tests should
be done in order of increasing $c_i/(1 - p_i)$. This, and related results, were obtained by
Slagle.
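
The ordering rule can be checked numerically on a small case. The sketch below assumes that the conditions are statistically independent, that testing stops at the first condition found false, and that every p_i is less than 1; the function names and the four sample (cost, probability) pairs are hypothetical. Under these assumptions the expected cost of an order is c_1 + p_1 c_2 + p_1 p_2 c_3 + ..., and exchanging two adjacent tests shows that test i should precede test j when c_i/(1 - p_i) is less than c_j/(1 - p_j).

```python
from itertools import permutations


def expected_cost(tests, order):
    """Expected total cost when tests run in the given order and stop at the first failure.

    tests[i] is a pair (c_i, p_i): cost of test i and probability that condition i is true.
    """
    total, reach = 0.0, 1.0        # reach = probability that this test is reached at all
    for i in order:
        cost, prob_true = tests[i]
        total += reach * cost
        reach *= prob_true
    return total


def best_order(tests):
    """Order of increasing c_i / (1 - p_i)."""
    return sorted(range(len(tests)), key=lambda i: tests[i][0] / (1.0 - tests[i][1]))


tests = [(4.0, 0.9), (1.0, 0.5), (10.0, 0.2), (3.0, 0.7)]
order = best_order(tests)
brute_force = min(expected_cost(tests, list(perm)) for perm in permutations(range(len(tests))))
print(order, expected_cost(tests, order), brute_force)
```

For this sample the ratio rule and an exhaustive search over all 24 orders agree on the minimum expected cost.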


C.  SNOBOL  Program 

The SNOBOL program of Fig. 2 is a concrete, simple example of a problem in
which a complete decision cannot be expected, so that time and memory limits are written
in. The language is also of interest as a model for studying where the causes for
observed errors were made, since there is essentially only one statement format. Notice
it is about as unstructured as possible: almost every statement is a GO TO.
start   syspot = "max length was " l
        syspot = "number of steps was " n
        syspot = ""
        st = syspit
        l = "0"
loop    l = l + "1"
        st */l + "1"*                              /s(loop)
        dt = st
        n = "1"
again   dt "0" */"2"* *y* = y "00"                 /s(a)
        dt "1" */"2"* *y* = y "1101"               /f(nulla)
a       dt "0" */"2"* *y* = y "00"                 /s(b)
        dt "1" */"2"* *y* = y "1101"               /f(nullb)
b       st "0" */"2"* *y* = y "00"                 /s(c)
        st "1" */"2"* *y* = y "1101"
c       st */"60"*             (max length)        /s(long)
        n = n + "2"
        n "401"                (time limit)        /s(time)
        st */l + "1"*                              /f(equ)
        l = l + "1"
equ     st dt                                      /f(ret)
        dt st                                      /s(per)
ret                                                /(again)
nulla   n = n - "1"
nullb   syspot = "string vanishes"                 /(start)
long    syspot = "string longer than max"          /(start)
time    syspot = "time limit exceeded"             /(start)
per     syspot = "becomes periodic"                /(start)
end     syspot = "end"

100100100100100100 
BECOMES PERIODIC
MAX  LENGTH  WAS  34 
NUMBER  OF  STEPS  WAS  101 

100100100100100 
TIME  LIMIT  EXCEEDED 
MAX  LENGTH  WAS  56 
NUMBER  OF  STEPS  WAS  401 

1010101010 
STRING  VANISHES 
MAX LENGTH WAS 11
NUMBER  OF  STEPS  WAS  12 

100100100100100100100 
STRING  LONGER  THAN  MAX 
MAX  LENGTH  WAS  59 
NUMBER  OF  STEPS  WAS  169 

Fig.  2.  SNOBOL  3 program  for  Post's  tag. 
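
For readers unfamiliar with SNOBOL, the behavior of the program can be sketched in Python. This is a paraphrase under stated assumptions rather than a translation: the rule is taken to be "if the first symbol is 0, delete three symbols and append 00; if it is 1, delete three symbols and append 1101"; a string shorter than three symbols is taken to vanish; periodicity is detected by comparing a string advanced one step at a time with a copy advanced two steps at a time; and the limits 60 and 401 are the ones visible in the listing. The names `tag_step` and `run_tag` are hypothetical, and the bookkeeping of maximum length may differ in detail from the SNOBOL program.

```python
def tag_step(s):
    """One step of the tag rule; returns None when fewer than three symbols remain."""
    if len(s) < 3:
        return None
    return s[3:] + ("00" if s[0] == "0" else "1101")


def run_tag(start, max_len=60, max_steps=401):
    """Return (outcome, longest string seen, number of steps taken)."""
    slow = fast = start
    steps, longest = 0, len(start)
    while True:
        for _ in range(2):                 # the fast copy takes two steps per pass
            fast = tag_step(fast)
            if fast is None:
                return ("string vanishes", longest, steps)
            steps += 1
            longest = max(longest, len(fast))
        slow = tag_step(slow)              # the slow copy takes one step per pass
        if len(fast) >= max_len:
            return ("string longer than max", longest, steps)
        if steps >= max_steps:
            return ("time limit exceeded", longest, steps)
        if slow == fast:
            return ("becomes periodic", longest, steps)


print(run_tag("1010101010"))
```

For the input 1010101010 the sketch reports that the string vanishes after 12 steps with a maximum length of 11, in agreement with the third sample run of Fig. 2.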


A.E.  Laemmel 
PROGRAM PATHS AND THE NUMBER OF TESTS NEEDED TO VERIFY A COMPUTER
PROGRAM

G.S.  Popkin  and  M.  L.  Shooman 

In an earlier report (Ref. 1) it was shown how matrices and binary programming may be
used to determine a lower bound on the minimum number of test cases needed to verify
that a program agrees with its flowchart. In this work, it is shown how flowchart contents
affect the actual number of test cases needed, and a formula for the upper bound
on the minimum number of cases is given and proved.

This work also includes a method of finding the actual minimum number of test
cases required, using binary programming and an enumeration of all program paths.

A.  Upper and Lower Bounds on the Number of Tests Needed to Verify a Program

Using the flowchart in Fig. 1, with its segments numbered, and following the methods
described in Ref. 1, the matrices M and T may be constructed with k = 7 (Table I).
The methods of Ref. 1 also yield 5 as the size of the maximum incomparable set, which
is the lower bound on the minimum number of tests needed to pass through every segment
at least once.

The  actual  minimum  number  of  test  cases  required  will  depend  on  the  contents  of 
the  flowchart  boxes.  Sufficient  conditions  for  achieving  the  lower  bound  are 

(1)  The  decisions  are  all  independent  of  one  another 

(2)  The  decision  variables  are  all  read  as  input 

(3)  The  program  does  not  modify  the  values  of  the  decision  variables. 

If, on the other hand, the decisions have a certain degree of dependence, the lower
bound may not be achievable. The flowchart in Fig. 2, for example, has the same
structure as that in Fig. 1 but cannot be tested with only 5 cases. Owing to the dependence
of the decisions, 6 tests are needed in order to pass through every segment at
least once. There are several different sets of 6 tests which will exercise every segment,
but no set smaller than 6.

By  changing  the  contents  of  the  boxes  in  the  flowchart  in  Fig.  2,  it  is  possible  to 
create  a different  flowchart  (but  with  the  same  structure)  that  will  require  7 tests  to 
exercise  all  segments. 

In  the  case  of  a flowchart  without  loops,  the  upper  bound  on  the  minimum  number 
of  test  cases  needed  to  pass  through  every  segment  at  least  once,  u,  is  given  by 

u = d + 1 

where  d is  the  number  of  deciders  in  the  flowchart. 


Fig.  1 with  contents . 
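TABLE I.  M and T for the flowchart of Figure 1.

           1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

       1   0  1  0  0  0  0  0  0  0  0  0  0  1  0  0
       2   0  0  1  0  0  0  0  1  0  0  0  0  0  0  0
       3   0  0  0  1  1  0  0  0  0  0  0  0  0  0  0
       4   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
       5   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
       6   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
       7   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
M =    8   0  0  0  0  0  1  1  0  0  0  0  0  0  0  0
       9   0  0  1  0  0  0  0  1  0  0  0  0  0  0  0
      10   0  0  1  0  0  0  0  1  0  0  0  0  0  0  0
      11   0  0  1  0  0  0  0  1  0  0  0  0  0  0  0
      12   0  0  1  0  0  0  0  1  0  0  0  0  0  0  0
      13   0  0  0  0  0  0  0  0  0  0  0  1  0  1  0
      14   0  0  0  0  0  0  0  0  0  0  1  0  0  0  1
      15   0  0  0  0  0  0  0  0  1  1  0  0  0  0  0

           1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

       1   0  1  5  5  5  5  5  5  1  1  1  1  1  1  1
       2   0  0  1  1  1  1  1  1  0  0  0  0  0  0  0
       3   0  0  0  1  1  0  0  0  0  0  0  0  0  0  0
       4   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
       5   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
       6   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
       7   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
T =    8   0  0  0  0  0  1  1  0  0  0  0  0  0  0  0
       9   0  0  1  1  1  1  1  1  0  0  0  0  0  0  0
      10   0  0  1  1  1  1  1  1  0  0  0  0  0  0  0
      11   0  0  1  1  1  1  1  1  0  0  0  0  0  0  0
      12   0  0  1  1  1  1  1  1  0  0  0  0  0  0  0
      13   0  0  4  4  4  4  4  4  1  1  1  1  0  1  1
      14   0  0  3  3  3  3  3  3  1  1  1  0  0  0  1
      15   0  0  2  2  2  2  2  2  1  1  0  0  0  0  0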

Proof: Consider a flowchart with no deciders. Such a flowchart has one segment
and requires one test. Each decider added to the flowchart can require at most one
additional test, so u = d + 1.

B.  Calculating the Number of Paths and Enumerating the Paths in a Flowchart, Using
    Matrices

The matrix T in Table I can be used to readily reveal the total number of paths in
the flowchart of Fig. 1 of Lipow (Ref. 2). Since all paths begin at segment 1, and end either at
segment 4, 5, 6 or 7, the total number of paths p is

$p = t_{1,4} + t_{1,5} + t_{1,6} + t_{1,7}$

This is seen to be 20, a number which can be obtained also by the decomposition
method described in Ref. 1 or by the direct calculation method described by Shooman (Ref. 3).
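
The path count can be reproduced with a short computation. The sketch below rests on two assumptions that are not stated in this section: that the entry t_ij of T counts the distinct paths from segment i to segment j, so that T = M + M^2 + M^3 + ... for a loop-free flowchart, and that the successor lists in `SUCCESSORS` are a correct reading of M in Table I. All names in the code are hypothetical.

```python
# Segments that can directly follow each segment (an assumed reading of matrix M).
SUCCESSORS = {
    1: [2, 13], 2: [3, 8], 3: [4, 5], 8: [6, 7],
    9: [3, 8], 10: [3, 8], 11: [3, 8], 12: [3, 8],
    13: [12, 14], 14: [11, 15], 15: [9, 10],
}
K = 15  # number of segments


def adjacency_matrix():
    """M[i][j] = 1 when segment j+1 can directly follow segment i+1."""
    m = [[0] * K for _ in range(K)]
    for seg, followers in SUCCESSORS.items():
        for nxt in followers:
            m[seg - 1][nxt - 1] = 1
    return m


def multiply(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(K)) for j in range(K)] for i in range(K)]


def path_count_matrix(m):
    """T = M + M^2 + ...; the sum is finite because without loops some power of M is zero."""
    total = [[0] * K for _ in range(K)]
    power = m
    while any(any(row) for row in power):
        total = [[total[i][j] + power[i][j] for j in range(K)] for i in range(K)]
        power = multiply(power, m)
    return total


T = path_count_matrix(adjacency_matrix())
p = sum(T[0][j - 1] for j in (4, 5, 6, 7))   # paths from segment 1 to the exit segments
print(p)
```

Under these assumptions the first row of the computed matrix matches the first row of T in Table I, and the four entries for the exit segments add up to 20.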


An enumeration of the 20 paths may be generated from the matrix M, and the
enumeration takes the form of a tree.

Since all 20 paths begin with segment 1, we start with the tree with node 1.

From the matrix M, row 1 indicates that segments 2 and 13 can follow segment 1,
so 2 and 13 are added to the tree.

Row 2 of M indicates that 3 and 8 follow 2, and row 13 indicates that 12 and 14
follow 13. These are added to the tree.

The tree is built level by level, with each bottom terminus of the tree directing
to a row of M, until construction of the tree terminates due to a row of M containing all
zeros. This process was applied to the flowchart of Fig. 1, and the result is shown in
Fig. 3, showing the segments that comprise each of the 20 paths in the flowchart.

Fig. 3.  A tree showing all the paths in the
flowchart of Figure 2.

The paths so enumerated can now be represented by a matrix, which will prove
useful in Section C. In the matrix, P, each column represents a path, and contains a
1 for each segment in the path. The matrix P from the tree in Fig. 3 is given in Table
II.
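TABLE II.  The path matrix for the path tree in Figure 3.

           1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

       1   1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
       2   1  1  1  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
       3   1  1  0  0  1  1  0  0  1  1  0  0  1  1  0  0  1  1  0  0
       4   1  0  0  0  1  0  0  0  1  0  0  0  1  0  0  0  1  0  0  0
       5   0  1  0  0  0  1  0  0  0  1  0  0  0  1  0  0  0  1  0  0
       6   0  0  1  0  0  0  1  0  0  0  1  0  0  0  1  0  0  0  1  0
       7   0  0  0  1  0  0  0  1  0  0  0  1  0  0  0  1  0  0  0  1
P =    8   0  0  1  1  0  0  1  1  0  0  1  1  0  0  1  1  0  0  1  1
       9   0  0  0  0  0  0  0  0  1  1  1  1  0  0  0  0  0  0  0  0
      10   0  0  0  0  0  0  0  0  0  0  0  0  1  1  1  1  0  0  0  0
      11   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  1  1  1
      12   0  0  0  0  1  1  1  1  0  0  0  0  0  0  0  0  0  0  0  0
      13   0  0  0  0  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
      14   0  0  0  0  0  0  0  0  1  1  1  1  1  1  1  1  1  1  1  1
      15   0  0  0  0  0  0  0  0  1  1  1  1  1  1  1  1  0  0  0  0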


Further  work  is  needed  to  develop  a method  of  generating  P from  M by  matrix 
operations,  without  the  intervening  tree  enumeration. 
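
The tree enumeration is easy to mechanize. The sketch below is one possible illustration: it walks the successor lists depth first (the text builds the tree level by level, which yields the same set of paths) and then forms the path matrix. The successor lists are an assumed reading of M, ordered so that the paths come out in the numbering used for the columns of P; the names `enumerate_paths` and `path_matrix` are hypothetical.

```python
# Segments that can directly follow each segment (an assumed reading of matrix M).
# Segments with no entry (4, 5, 6, 7) correspond to all-zero rows of M.
SUCCESSORS = {
    1: [2, 13], 2: [3, 8], 3: [4, 5], 8: [6, 7],
    9: [3, 8], 10: [3, 8], 11: [3, 8], 12: [3, 8],
    13: [12, 14], 14: [15, 11], 15: [9, 10],
}


def enumerate_paths(start=1):
    """List every path from the start segment to a segment with no successor."""
    paths = []

    def extend(path):
        followers = SUCCESSORS.get(path[-1], [])
        if not followers:              # all-zero row of M: the path is complete
            paths.append(path)
            return
        for nxt in followers:
            extend(path + [nxt])

    extend([start])
    return paths


def path_matrix(paths, num_segments=15):
    """P[i][j] = 1 when segment i+1 lies on path j+1 (one column per path)."""
    return [[1 if seg in path else 0 for path in paths]
            for seg in range(1, num_segments + 1)]


paths = enumerate_paths()
P = path_matrix(paths)
print(len(paths), paths[0], paths[-1])
```

Each row of `P` is then the list of paths that pass through one segment, which is the form needed for the covering problem of Section C.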


C.  Finding the Actual Minimum Number of Tests Needed

To find the minimum number of tests needed for any given flowchart, start with
the path matrix P and remove any columns which represent infeasible paths. For example,
to find the minimum number of tests for the flowchart in Fig. 2, remove from
P (Table II) columns which represent paths that cannot be traversed. One such path,
for example, consists of segments 1, 2, 8 and 7. It cannot be traversed because of the
way M is set.

The flowchart of Fig. 2 is small enough that the infeasible paths can be found by
inspection. For larger flowcharts, logical operations (AND, OR, etc.) can be applied
to the columns of P to detect infeasible paths. Methods of applying such logical operations
will be shown in a later report.


In  very  large  flowcharts  where  flow  relationships  are  less  obvious,  it  may  not  be 
possible  to  find  and  remove  all  infeasible  columns  from  P.  Such  a defect  is  not  fatal, 
however,  and  can  be  remedied  by  a procedure  to  be  explained  later. 


After the infeasible columns are removed, the remaining matrix, say U, can be
used to construct a binary programming problem whose solution is the number of test
cases required, and the paths that those test cases should traverse. Let J be the index
set of the feasible paths. Then the binary programming problem is

minimize    $z = \sum_{j \in J} x_j$

subject to  $U X \ge 1$

            $x_j = 0, 1 \quad \forall j$

where X is a vector whose transpose is $(x_j \mid j \in J)$. The solution yields a value for z which
is the minimum number of tests needed to pass through all segments at least once. The
variables $x_j$ refer to the feasible paths, and in the solution to the binary programming
problem

$x_j = 1$, if path j is one of the paths to be traversed by a test

$x_j = 0$, otherwise

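As an example, the 10 feasible paths from P in Table II are 1, 2, 7, 8, 11, 12, 15,
16, 19 and 20. So the binary programming problem is

minimize    $z = \sum_j x_j , \qquad j = 1, 2, 7, 8, 11, 12, 15, 16, 19, 20$

subject to  $U X \ge 1$, where the rows of U are the segments 1 to 15 and the columns
are the feasible paths:

           1  2  7  8 11 12 15 16 19 20

       1   1  1  1  1  1  1  1  1  1  1
       2   1  1  0  0  0  0  0  0  0  0
       3   1  1  0  0  0  0  0  0  0  0
       4   1  0  0  0  0  0  0  0  0  0
       5   0  1  0  0  0  0  0  0  0  0
       6   0  0  1  0  1  0  1  0  1  0
       7   0  0  0  1  0  1  0  1  0  1
U =    8   0  0  1  1  1  1  1  1  1  1
       9   0  0  0  0  1  1  0  0  0  0
      10   0  0  0  0  0  0  1  1  0  0
      11   0  0  0  0  0  0  0  0  1  1
      12   0  0  1  1  0  0  0  0  0  0
      13   0  0  1  1  1  1  1  1  1  1
      14   0  0  0  0  1  1  1  1  1  1
      15   0  0  0  0  1  1  1  1  0  0

X is the column vector $(x_1, x_2, x_7, x_8, x_{11}, x_{12}, x_{15}, x_{16}, x_{19}, x_{20})^T$, the right-hand
side is a column of fifteen 1's, and

$x_j = 0, 1; \quad j \in J$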

This is a standard-form set covering problem, and the matrix reductions given
by Garfinkel and Nemhauser (Ref. 4) may be applied to U. By their Reduction 4, $r_1 \ge r_2 =
r_3 \ge r_4$, so rows 1, 2 and 3 are eliminated. Also by Reduction 4, $r_8 = r_{13} \ge r_{14} \ge r_{15} \ge
r_9$, so rows 8, 13, 14 and 15 are deleted. By Reduction 2, $r_4 = e_1$, so row 4 and
column 1 are deleted, and $x_1 = 1$. Also by Reduction 2, $r_5 = e_2$, so row 5 and column 2 are
deleted, and $x_2 = 1$.

The problem that remains is

minimize    $z = \sum_j x_j , \qquad j = 7, 8, 11, 12, 15, 16, 19, 20$

subject to

           7  8 11 12 15 16 19 20

       6   1  0  1  0  1  0  1  0        x_7          1
       7   0  1  0  1  0  1  0  1        x_8          1
       9   0  0  1  1  0  0  0  0        x_11         1
      10   0  0  0  0  1  1  0  0        x_12    ≥    1
      11   0  0  0  0  0  0  1  1        x_15         1
      12   1  1  0  0  0  0  0  0        x_16         1
                                         x_19
                                         x_20

$x_j = 0, 1; \quad j \in J$


There are 14 optimal solutions, with z = 4. Some of them are

$x_7 = x_{11} = x_{15} = x_{20} = 1$,    all other $x_j = 0$

$x_7 = x_{11} = x_{16} = x_{19} = 1$,    all other $x_j = 0$

$x_7 = x_{11} = x_{16} = x_{20} = 1$,    all other $x_j = 0$

$x_7 = x_{12} = x_{15} = x_{19} = 1$,    all other $x_j = 0$

$x_7 = x_{12} = x_{15} = x_{20} = 1$,    all other $x_j = 0$

The corresponding solutions to the original problem have z = 6, and

$x_1 = x_2 = x_7 = x_{11} = x_{15} = x_{20} = 1$,    all other $x_j = 0$

etc.

This  says  that  6 tests  are  required  to  pass  through  all  segments  at  least  once, 
and  the  6 tests  should  traverse  paths  1, 2,7,  11,  15  and  20  as  those  paths  are  defined  in 
the  matrix  U. 
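
For a problem of this size the set covering result can be confirmed by exhaustive search. The sketch below is an illustration only and is not the binary programming method of the text: it tries subsets of the feasible paths in order of increasing size until every segment is covered. The segment lists for the ten feasible paths are an assumed reading of the columns of U, and the names used are hypothetical.

```python
from itertools import combinations

# Segments on each feasible path (an assumed reading of the columns of U).
FEASIBLE_PATHS = {
    1: [1, 2, 3, 4],
    2: [1, 2, 3, 5],
    7: [1, 13, 12, 8, 6],
    8: [1, 13, 12, 8, 7],
    11: [1, 13, 14, 15, 9, 8, 6],
    12: [1, 13, 14, 15, 9, 8, 7],
    15: [1, 13, 14, 15, 10, 8, 6],
    16: [1, 13, 14, 15, 10, 8, 7],
    19: [1, 13, 14, 11, 8, 6],
    20: [1, 13, 14, 11, 8, 7],
}
SEGMENTS = set(range(1, 16))


def minimum_covers(paths, segments):
    """Return (z, list of path sets of size z) for the smallest sets covering every segment."""
    labels = sorted(paths)
    for size in range(1, len(labels) + 1):
        found = [combo for combo in combinations(labels, size)
                 if set().union(*(paths[j] for j in combo)) >= segments]
        if found:
            return size, found
    return None, []


z, solutions = minimum_covers(FEASIBLE_PATHS, SEGMENTS)
print(z, len(solutions), (1, 2, 7, 11, 15, 20) in solutions)
```

The search agrees with the reductions above: paths 1 and 2 appear in every minimum cover, and the remaining four paths can be chosen in 14 ways. Exhaustive search grows exponentially with the number of feasible paths, so it is useful only as a check on small flowcharts.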

As mentioned earlier, the matrix U may sometimes contain some infeasible paths,
if the flow relationships in P were not sufficiently obvious for the detection and removal
of all infeasible paths. In such a case, one or more infeasible paths may appear in the
solution of the binary programming problem. Then, when an attempt is made to construct
test data to traverse the paths in the solution, the infeasible paths will be detected.
At that time they can be removed from U, and the binary programming problem solved
again. The process can be repeated if necessary until a solution free of infeasible paths
appears.

The  essential  differences  between  the  procedure  described  here  and  the  binary 
programming  problem  of  Ref.  1 are  these: 

In Ref. 1, a matrix F was used to find a maximum incomparable set. F is a
square matrix whose dimension is equal to the number of segments in a flowchart.
Even in a large flowchart the number of segments might not be more than several hundred,
and usually the number of segments is much smaller and manageable in a practical
way. F was used to form a maximization problem, in which the solution is the maximum
incomparable set of segments, and also the lower bound on the minimum number
of tests needed to pass through each segment at least once.

In this report, a matrix U is used to find a minimum number of tests. In general,
U is not square. The number of rows equals the number of segments in the flowchart,
but the number of columns equals the number of feasible paths. Even in a modest-sized
flowchart, the number of feasible paths may be very large and not easy to find.
U is used to form a minimization problem, and the solution is the minimum number of
feasible paths needed to pass through ("cover") every segment at least once.

Further work is needed to develop methods of deriving U in a manner that is practical
for real flowcharts. Also, an attempt will be made later to apply these methods to
flowcharts with loops.

Rome  Air  Development  Center 

F30602-74-C-0294  G.S.  Popkin  and  M.  L.  Shooman
APPLICATION OF LOGICAL OPERATIONS TO FINDING FEASIBLE PATHS IN A
PROGRAM FLOWCHART

G.S.  Popkin  and  M.  L.  Shooman 

In Ref. 1, a flowchart was given in which it was desired to find all the feasible
paths. The flowchart is reproduced here as Figure 1. Also in Ref. 1, a matrix P was
shown, displaying all the paths, feasible and infeasible, in the flowchart. P is reproduced
here as Table I.


Fig.  1.  A flowchart. 

The flowchart in Fig. 1 is small enough that the feasible paths can be found by
inspection, as was done in Ref. 1. For larger flowcharts, it is desirable to have
a more methodical way of finding the feasible paths. The method proposed in Ref. 1 is
to find the infeasible paths by logical operations upon P, and then to remove the
corresponding columns from P. If the method of finding infeasible paths were imperfect,
as it promises to be, and some infeasible paths remained undetected, then the
remaining matrix would contain not only all the feasible paths but also some
infeasible ones. The infeasible paths would then be found and removed by methods
described in Ref. 1, as the matrix remaining from P was used in further reductions.


TABLE I. The path matrix P for the flowchart in Figure 1.
(Rows are segments; columns are paths.)

          Path
Segment    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
    1      1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
    2      1  1  1  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
    3      1  1  0  0  1  1  0  0  1  1  0  0  1  1  0  0  1  1  0  0
    4      1  0  0  0  1  0  0  0  1  0  0  0  1  0  0  0  1  0  0  0
    5      0  1  0  0  0  1  0  0  0  1  0  0  0  1  0  0  0  1  0  0
    6      0  0  1  0  0  0  1  0  0  0  1  0  0  0  1  0  0  0  1  0
    7      0  0  0  1  0  0  0  1  0  0  0  1  0  0  0  1  0  0  0  1
    8      0  0  1  1  0  0  1  1  0  0  1  1  0  0  1  1  0  0  1  1
    9      0  0  0  0  0  0  0  0  1  1  1  1  0  0  0  0  0  0  0  0
   10      0  0  0  0  0  0  0  0  1  1  1  1  0  0  0  0  0  0  0  0
   11      0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  1  1  1
   12      0  0  0  0  1  1  1  1  0  0  0  0  0  0  0  0  0  0  0  0
   13      0  0  0  0  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
   14      0  0  0  0  0  0  0  0  1  1  1  1  1  1  1  1  1  1  1  1
   15      0  0  0  0  0  0  0  0  1  1  1  1  1  1  1  1  0  0  0  0



The approach taken to find the columns of P that represent infeasible paths, which
failed, was to establish vectors describing characteristics of feasible paths and
then to perform logical operations between those vectors and the columns of P. For
example, from the flowchart it can be seen that any feasible path containing segment 2
must also contain segment 3, and conversely. This can be expressed as a column vector
whose transpose is $C_1 = (0,1,1,0,0,0,0,0,0,0,0,0,0,0,0)$. Also, any feasible path
containing segment 12 must also contain segment 8, but not conversely. Perhaps this
could be expressed as $C_2 = (0,0,0,0,0,0,0,1,0,0,0,1,0,0,0)$. In any case, the tester
of the program could construct a number of such vectors describing relationships
existing in the flowchart.


It may be possible to perform ANDing operations between the $C_i$ and the columns
of P and so to find columns which represent infeasible paths. For example, the
following was tried and found wanting: any column of P, say $k_j$, represents an
infeasible path if and only if

$$C_i \cdot k_j \neq C_i, \quad \text{all } i \qquad (1)$$

Using just $C_1$ and $C_2$, it so happens that Eq. (1) would correctly detect paths
3, 4, 5 and 6 as infeasible. It would incorrectly call paths 11 and 12 infeasible.
Equation (1) could be made to work for paths 11 and 12 by adding to the set of the
$C_i$ a vector which says that any path containing segment 9 must also contain
segment 8, namely $C_3 = (0,0,0,0,0,0,0,1,1,0,0,0,0,0,0)$. But for this procedure to
be workable it cannot depend on the set of the $C_i$ being complete. While it is not
fatal if the procedure leaves some infeasible paths undetected, it must not under any
circumstances incorrectly call a feasible path infeasible.
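
The dependence on a complete set of constraint vectors can be seen in a small sketch. It assumes a literal reading of Eq. (1), namely that a path is flagged when the AND with every constraint vector differs from that vector. The helper names (`vector`, `flagged_infeasible`) and the sample path are hypothetical; the path is chosen only to use segments 8 and 9 but not 12, as the text says paths 11 and 12 do.

```python
N_SEGMENTS = 15

def vector(segments):
    """0/1 tuple of length 15 with ones at the given 1-based segment numbers."""
    return tuple(1 if i in segments else 0 for i in range(1, N_SEGMENTS + 1))

def bitwise_and(a, b):
    return tuple(x & y for x, y in zip(a, b))

def flagged_infeasible(path, constraints):
    """Literal reading of Eq. (1): flag when (C_i AND k) != C_i for every i."""
    return all(bitwise_and(c, path) != c for c in constraints)

C1 = vector({2, 3})    # segment 2 if and only if segment 3
C2 = vector({8, 12})   # segment 12 implies segment 8
C3 = vector({8, 9})    # segment 9 implies segment 8

# Hypothetical path using segments 8 and 9 but not 12.
path = vector({1, 6, 8, 9, 10, 13, 14, 15})

wrongly_flagged = flagged_infeasible(path, [C1, C2])
after_adding_c3 = flagged_infeasible(path, [C1, C2, C3])
print(wrongly_flagged, after_adding_c3)
```

With only $C_1$ and $C_2$ the path is flagged; adding $C_3$ removes the flag, which illustrates why the test is unsafe whenever the constraint set is incomplete.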


In an attempt to remedy the defect involved in needing a complete set of the $C_i$,
ANDing with single vectors instead of the complete set was tried. Unfortunately,
this approach also failed.


Another approach would be to change the inequality in Eq. (1) to an equality and
search for feasible paths instead of infeasible ones, but with that method there is
still the risk that some feasible paths would be called infeasible because the set
of the $C_i$ is incomplete.


Also, an approach using ORing was tried, with results that were even less
acceptable than those from ANDing.

Rome  Air  Development  Center 

F30602-74-C-0294  G.S. Popkin and M. L. Shooman

PROPOSED MEASURES FOR THE EVALUATION OF SOFTWARE
S.N. Mohanty and M. Adamowicz

Comprehensive testing of software is the key route to increased software
reliability. If good measures of testing coverage can be defined, systematic exercise
of software can lead to real improvement in the near term. To achieve such a testing
measure, the basic approach is to partition each complex module into a series of
smaller, manageable internal modules, each of which is dealt with separately. After
the internal modular structure of the program has been identified, appropriate
aggregations of modules are made and the iteration structure for each is identified.
The computation is performed as follows. The hierarchy of program sub-schema is found
by automatically performing a series of reductions, namely series, parallel and
self-loop. The main objective of deriving the sub-schema structure is to identify
self-loop reduction points and to develop a search strategy for minimizing the
combinatorial growth within the backtracking process. Then the real program is
analyzed by appropriate techniques so as to identify conditions which might be met
within a program input-space to satisfy certain test objectives.

This report is concerned with the definition and development of two measures for
characterizing the structural quality of computer programs. These measures are then
used to define and measure a program's testedness (that is, how well a program is
tested after a series of tests). Examples have been included to clarify and demonstrate
the definitions and to show how such measures can be used to study and compare
different programs. It should be remembered that the testedness measure applies to the
program itself and not to what the program should have been. Thus this measure might
be very different from a reliability measure in the usual sense.

A.  Software  Evaluation  Measures 

A computer program consists of executable modules which are either segments
or nodes. Segments consist of a contiguous set of assignment and unconditional
execution statements, and nodes consist of conditional or branch statements. The
nature and the logical distribution of the segments and nodes of a computer program
determine its particular structural qualities. In order to identify and quantify such
qualities, we introduce some new terminology. The term accessibility is introduced to
quantify the concept of reaching, or accessing, an executable module during program
testing. It is also assumed that before an executable module is accessed we have
successfully executed the preceding modules. The term testability is introduced to
quantify the ease of executing a program module. Primary factors determining the
testability of a module are its accessibility and complexity measures. The complexity
measure determines how many resources are required in executing the module.

The testedness concept is introduced so as to evaluate the software testing
process. It is a function of testability and frequency of execution.

B.  Logical  Structure 

Definition: A logical structure (abbreviated to LS) is defined as a collection of
at least two nodes, one of which is a start node through which the
control enters the LS from the outside environment and the other a
stop node through which the control leaves the LS to the outside
environment. The two nodes are connected by a directed line segment.
The nodes are arranged in hierarchical levels, the start node being in
level 1 and the levels of the succeeding nodes being determined by
the largest count of the non-repeating line segments to the node from
the start node.

$$LS = \{(N_{ij}, I_{ijkl}, N_{kl}) \mid i, j, k, l \geq 1\}$$

where

$N_{ij}$ = j-th node in the i-th level,
$N_{kl}$ = l-th node in the k-th level,
$I_{ijkl}$ = directed line segment from the j-th node in the i-th level to the
l-th node in the k-th level.

Two examples of logical structures are portrayed in Figure 1.


(a) Start node: $N_{11}$; stop node: $N_{21}$
    LS = [($N_{11}$, $I_{1121}$, $N_{21}$), ($N_{21}$, $I_{2111}$, $N_{11}$)]

(b) Start node: $N_{11}$; stop node: $N_{31}$
    LS = [($N_{11}$, $I_{1121}$, $N_{21}$), ($N_{21}$, $I_{2131}$, $N_{31}$),
          ($N_{31}$, $I_{3111}$, $N_{11}$)]

Fig. 1. Examples of logical structures.

C. Computer Program Equivalence to Logical Structure

Any computer program can be thought of as a collection of modules, each module
containing one or more executable statements, connected together directly through
transfer statements (goto, call) or indirectly when modules occur in sequence. The
computer program has the following properties:

(1) Each program has a start node and a stop node. The start node is the statement
which signals the beginning of execution (i.e., the control enters the program
environment) and the stop node is the statement which signals the completion
of execution (i.e., the control leaves the program environment).

(2) Each program module can be arranged in a hierarchical level and connected
to other modules through directed line segments (i.e., direct or indirect
transfer statements).

If we replace each module and the associated transfer statements by an equivalent
node and a line segment, then the resulting structure is an LS. Thus any computer
program can be represented by an equivalent LS.

D.  Accessibility 

Definition:  The  accessibility  of  an  executable  module  is  defined  as  a sum  of  the 
products  of  the  accessibilities  of  the  preceding  modules,  times  the 
probability  of  traversing  the  path  from  the  preceding  module  to  the 
module  under  consideration  and  the  probability  of  successful  execution 
of  the  preceding  module.  The  accessibility  of  the  start  module  is  1. 

Consider a program consisting of many modules, that is, segments, which are
collections of sequentially executable statements, and nodes, which are either labels
or conditional statements. Then

$$A_{kl} = \sum_i \sum_j A_{ij} * Q_{ijkl} * P_{ij}$$

where

$A_{kl}$ = Accessibility of the l-th module in the k-th level, under consideration
$A_{ij}$ = Accessibility of the j-th module in the i-th level
$Q_{ijkl}$ = Probability of traversing the path from module (ij) to the module (kl)
$P_{ij}$ = Probability of successful execution of module j in level i.

Assume that the probability of successful execution of any segment or node
is inversely proportional to the number of basic executable statements contained in
it, where a basic statement is of the form VARIABLE := VARIABLE OPERATOR VARIABLE
or VARIABLE := OPERATOR VARIABLE. Then the probability of successful execution of the
node or segment, $P_{ij}$, is inversely proportional to $C_{ij}$, where $C_{ij}$ is
the number of basic executable statements in the segment or node. Then
$P_{ij} = k/C_{ij}$, where k = constant of proportionality. As a special case, if
k = 1 then the accessibility has the following form

$$A_{kl} = \sum_i \sum_j (A_{ij}/C_{ij}) * Q_{ijkl}$$


As an example of the calculation of accessibilities, consider the flowchart of Figure 2.

Fig. 2. A flowchart for the illustration of accessibility calculations.

The accessibilities of the modules are:

$A_{11} = 1$
$A_{21} = 1$
$A_{31} = 1/3$
$A_{32} = 1/3$
$A_{33} = 1/3$
$A_{41} = A_{31} \ldots + A_{32} \ldots = \ldots$
$A_{42} = 1/6$
$A_{stop} = A_{41} + A_{42} + A_{33} = \ldots$

(Note: We have assumed that each module has one basic executable statement.)
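
The accessibility recursion of Section D can be sketched in a few lines. This is a minimal illustration under assumptions: the function name `accessibilities`, the module names, the branch probabilities, and the statement counts are all hypothetical, and the graph below is not the flowchart of Figure 2. It uses the special case k = 1, so that $P_{ij} = 1/C_{ij}$.

```python
from fractions import Fraction

def accessibilities(order, edges, statements):
    """A_kl = sum over predecessors (ij) of A_ij * Q_ijkl / C_ij  (case k = 1).

    order      -- module names in topological order, start module first
    edges      -- {(pred, succ): Q}, probability of taking that line segment
    statements -- {module: C}, number of basic executable statements
    """
    A = {order[0]: Fraction(1)}  # the start module has accessibility 1
    for m in order[1:]:
        A[m] = sum(
            (A[p] * q / statements[p] for (p, s), q in edges.items() if s == m),
            Fraction(0),
        )
    return A

# Hypothetical five-module program with one two-way branch.
order = ["start", "a", "b", "c", "stop"]
edges = {
    ("start", "a"): Fraction(1),
    ("a", "b"): Fraction(1, 2),
    ("a", "c"): Fraction(1, 2),
    ("b", "stop"): Fraction(1),
    ("c", "stop"): Fraction(1),
}
statements = {"start": 1, "a": 1, "b": 2, "c": 1, "stop": 1}

acc = accessibilities(order, edges, statements)
print(acc["b"], acc["stop"])
```

Because module `b` is given two basic statements here, the branch through it contributes only half of its accessibility to the stop module.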

E.  Testability 

Once we have computed the accessibilities of the modules, we then compute the
module testabilities. Conceptually, testability is the ease with which a module can be
accessed and executed in any test sequence. It is a function of the module
accessibility and the resources required to execute the module. The resources can be
proportional to the number of basic executable statements in the module. They can also
be the time and money required to execute the module.

Definition: The testability of an executable module is defined as the ratio of the
module accessibility to the resources required to execute the module.

Then,

$$T_{ij} = A_{ij} / R_{ij}$$

where

$T_{ij}$ = Testability of the module j in level i
$A_{ij}$ = Accessibility of the module j in level i
$R_{ij}$ = Resources required for executing the module j in level i.

In the above expression, if $R_{ij} \to 0$, then $T_{ij} \to \infty$, meaning we can
execute the module an infinite number of times without incurring any expense. On the
other hand, if $R_{ij} \to \infty$, then $T_{ij} \to 0$, meaning we cannot execute the
module at all because of the enormous cost involved in executing the module.

Let the resources required to execute any program module be directly proportional
to the complexity of the program module. Then,

$$R_{ij} = L_{ij} * C_{ij}$$

where

$L_{ij}$ = constant of proportionality for module (ij)
$C_{ij}$ = complexity of module (ij)

Then,

$$T_{ij} = A_{ij} / R_{ij}$$

$$T_{ij} = A_{ij} / (L_{ij} * C_{ij})$$



F.  Testedness 

Before giving the measures for evaluating the software in terms of the
accessibilities we observe that:

(1) One has to conduct an optimum number of tests for testing the program.
Increasing the number of tests beyond this optimal level should not increase
the program testedness appreciably, but decreasing the number of tests below
this level should decrease the program testedness appreciably.

(2) Given a fixed number of tests to be performed, the net contribution towards
the overall program testedness by the segments with smaller testability should
be relatively large.

Definition: The testedness of a module is defined as a function of the module
testability and its frequency of execution during program testing.

Then,

$$W_{ij} = 1 - e^{-(f_{ij} T_{ij})}$$

where

$f_{ij}$ = Frequency of execution of the module j in level i
$T_{ij}$ = Testability of the module j in level i
$W_{ij}$ = Testedness of the module j in level i.


Testedness of a program:

$$W = \sum_i \sum_j (P_{ij} * W_{ij}) \Big/ \sum_i \sum_j P_{ij}$$

where

$P_{ij}$ = Probability of execution of module j in level i
$W_{ij}$ = Testedness of the module j in level i
$W$ = Program testedness.

As an example of these concepts consider the flowchart of Figure 3.

Fig. 3. Flowchart for the illustration of testedness.

$T_{21}$ = Testability of module 21
$T_{22}$ = Testability of module 22
$f_{21}$ = Frequency of execution of module 21
$f_{22}$ = Frequency of execution of module 22
$W_{21}$ = Testedness of module 21
$W_{22}$ = Testedness of module 22
$W$ = Program testedness

$P_{21} = P_{22} = 0.5$



The following table shows the program testedness for different values of testability
and frequency of execution.

TABLE I. Program testedness for different values of testability and frequency
of execution.

  T_21     T_22     f_21    f_22    W_21     W_22     W
  0.5      0.5      1       10      0.393    0.993    0.693
  0.5      0.5      10      10      0.993    0.993    0.993
  0.5      0.5      10      100     0.993    1.0      0.9965
  0.001    0.12     1       29      0.001    0.97     0.485
  0.001    0.005    1200    1200    0.699    0.997    0.848


G.  Conclusion 

In this report we have introduced some new concepts and proposed measures for
evaluating software. The usefulness of testability and testedness has been demonstrated
by examples. The examples included are rather small, but the concepts can be extended
to a program of any size. These concepts can be used to test a computer program
automatically and to evaluate its testedness.

Rome  Air  Development  Center 

F30602-74-C-0294  S.N. Mohanty and M. Adamowicz

SEEDING/TAGGING ESTIMATES OF THE NUMBER OF SOFTWARE ERRORS
B. Rudner


Previous reports have analyzed estimates supported by three different models of
the debugging process. The first model assumes that all errors have equal probability
of discovery. Under this assumption, the maximum likelihood estimate of N, the total
number of errors in a program, is $N_0 = st/c$, where t errors are either deliberately
inserted in a program (seeded) or found by debugging (tagged); s errors are found by a
debugger unaware of the contents of the first set; and c is the number of errors
appearing in both sets. Chapman shows that $N_0$ is biased and gives simple approximate
forms for the bias and mean-squared error* which are stated to be valid for the ratio
st/N > 10. An analysis of a modified maximum-likelihood estimate,
$N_1 = (s+1)(t+1)/(c+1) - 1$, was given in the same paper. $N_1$ has the advantage of
being practically unbiased.
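
The two estimates can be written directly from their definitions. This is a minimal sketch; the function names and the sample counts (s = t = 20, c = 13) are hypothetical and chosen only for illustration.

```python
def n0_estimate(s, t, c):
    """Maximum likelihood estimate N_0 = s*t/c (undefined when c == 0)."""
    if c == 0:
        raise ValueError("N_0 is undefined when no errors are common to both sets")
    return s * t / c

def n1_estimate(s, t, c):
    """Modified estimate N_1 = (s+1)(t+1)/(c+1) - 1 (defined even when c == 0)."""
    return (s + 1) * (t + 1) / (c + 1) - 1

# Hypothetical experiment: 20 tagged errors, 20 sampled errors, 13 in common.
n0 = n0_estimate(20, 20, 13)
n1 = n1_estimate(20, 20, 13)
print(n0, n1)
```

Note that $N_1$ remains finite when no common errors are found, whereas $N_0$ does not.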


A.  Comparison  of  Statistics  Computed  in  Different  Ways 

Other estimates have also been considered. Among them are estimates involving
more than one value of c, found from several tests. It was necessary to use methods
other than those used by Chapman to find the statistics of two of these multi-trial
estimates, and a method based on Taylor's series was applied to find formulas for the
mean and mean-squared error. A by-product of this work was that, by reducing the
number of trials to 1 in the appropriate formulas, we have alternative formulas for
the statistics of $N_0$. It was pointed out in Ref. 2 that the results obtained in
this way were considerably lower than those found previously. A second inconsistency
was noted when comparison was made with some specific cases in a tabulation containing
means and variances computed directly from probabilities.

As a result, Chapman's simplified formulas were checked. It was found that
certain approximations used in deriving the formulas introduced considerable error
unless N, s and t were actually very large numbers; the condition st/N large was not
sufficient. Table I lists three cases, for all of which st/N = 13.33, giving the
percent error in the original approximation formula for $E(N_0)$. Clearly the error
decreases as the magnitudes increase. The mean-squared error formula shows an even
larger deviation.


TABLE I. Percent error in first approximation formula for $E(N_0)$ for several
examples with st/N = 13.33.

  N, s, t           % error in approximate formula for E(N_0)
  30, 20, 20        -
  270, 60, 60       3.5
  3000, 200, 200    1.1

The bias resulting from the approximate formula was, for small N, not much
larger than the error, indicating that both bias and dispersion might actually be
considerably lower than appeared. Consequently, it was necessary to derive second-order
approximations for bias and mean-squared error, more accurate than Chapman's
first-order approximations but still simpler than the exact expressions, in order to
permit quick calculation, to provide insight into the manner in which bias and
dispersion change with changing parameter values, and to facilitate comparison with
other estimates.

* Because of the bias, the mean-squared error $V(\hat{N}) = E[(\hat{N} - N)^2]$,
rather than the variance, was taken as a measure of dispersion.


On the basis of the new approximations, additional information was obtained on
the manner in which bias and mean-squared error change with the parameters,
information which is useful in designing an actual estimation experiment.

B.  Bias 

The new approximation for the expected value of $N_0$, derived from Chapman's exact
result, is

$$E(N_0) \approx st\,[\alpha_1 + \alpha_2 + 2\alpha_3 + 6\alpha_4 + \cdots + (m-1)!\,\alpha_m] \qquad (1)$$

where

$$\alpha_1 = \frac{N+1}{(s+1)(t+1)}, \qquad \alpha_i = \alpha_{i-1}\,\frac{N+i}{(s+i)(t+i)}, \quad i = 2, 3, \ldots$$

The requirements for accuracy are the following:

1. A sufficient number of terms must be included in the finite sum to leave the
remainder term of the infinite sum from which it is drawn insignificant. Four or five
have been found adequate.

2. The probability that c = 0 must be very small. By referring to the examples
of hypergeometric distribution in Ref. 4 one sees that this occurs when the peak is far
from 0, i.e., when the mean of the distribution, st/N, is large. In fact, common
sense tells us that large samples (i.e., s and t large, and therefore st/N large) are
almost certain to have elements in common; $P(0) \approx 0$. st/N > 3 seems to be
sufficient for accuracy unless N is very large (in which case the variance of the
distribution and therefore P(0) are large).

An alternative form of Eq. (1), derived by simple manipulations, is

$$E(N_0) \approx N\,[k_1 + k_2 (N/st) + 2k_3 (N/st)^2 + \cdots + (m-1)!\,k_m (N/st)^{m-1}] \qquad (1a)$$

where

$$k_1 = \frac{1 + 1/N}{(1 + 1/s)(1 + 1/t)}, \qquad k_i = k_{i-1}\,\frac{1 + i/N}{(1 + i/s)(1 + i/t)}, \quad i = 2, 3, \ldots$$

The quantities $k_i$ are close to 1 and approach 1 in the limit as s, t and N increase.
If we set all $k_i = 1$, we arrive at Chapman's approximate formula

$$E(N_0) \approx N\,[1 + (N/st) + 2(N/st)^2 + \cdots]$$

The formula resulting from the Taylor's series expansion is

$$E(N_0) \approx N\,[1 + q(N/st) + 3q^2 (N/st)^2] \qquad (2)$$

where

$$q = (N-s)(N-t)/N^2$$

This is subject to the same caveat as Eq. (1): truncation effect and $P(0) \neq 0$ are
possible sources of error. Both tend to show up for small st/N, and for large values
of N, s, t, i.e., values for which min(s, t) >> st/N.
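
As a numerical cross-check of Eqs. (1), (1a) and (2), the sketch below evaluates all three for one of the cases in Table I. The function names and the default of five terms are assumptions made for illustration; the formulas are those stated above.

```python
from math import factorial

def mean_n0_alpha(N, s, t, m=5):
    """Eq. (1): s*t * sum_{i=1..m} (i-1)! * alpha_i."""
    total, alpha = 0.0, 1.0
    for i in range(1, m + 1):
        alpha *= (N + i) / ((s + i) * (t + i))  # alpha_i from alpha_{i-1}
        total += factorial(i - 1) * alpha
    return s * t * total

def mean_n0_k(N, s, t, m=5):
    """Eq. (1a): N * sum_{i=1..m} (i-1)! * k_i * (N/st)^(i-1)."""
    total, k, ratio = 0.0, 1.0, N / (s * t)
    for i in range(1, m + 1):
        k *= (1 + i / N) / ((1 + i / s) * (1 + i / t))  # k_i from k_{i-1}
        total += factorial(i - 1) * k * ratio ** (i - 1)
    return N * total

def mean_n0_taylor(N, s, t):
    """Eq. (2): N * [1 + q*(N/st) + 3*q^2*(N/st)^2]."""
    q = (N - s) * (N - t) / N**2
    r = N / (s * t)
    return N * (1 + q * r + 3 * q**2 * r**2)

e_alpha = mean_n0_alpha(3000, 200, 200)
e_k = mean_n0_k(3000, 200, 200)
e_taylor = mean_n0_taylor(3000, 200, 200)
print(e_alpha, e_k, e_taylor)
```

Eqs. (1) and (1a) are algebraically the same sum, so the first two values coincide up to rounding; the Taylor form lands within a fraction of a percent of them for this case.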

The bias b of an estimate $\hat{N}$ is defined by $E(\hat{N}) = N + b$. The quantity
of greatest interest is the ratio b/N (or percent bias = b/N x 100%), since to
estimate N = 100 as N = 120 is clearly a grosser error than to estimate N = 1000 as
1020.

The percent bias of $N_0$ varies in three different ways: with the size of the tagged
and sampled sets relative to the total number of errors, quantified by the ratio st/N;
with the total number of errors N; and with the size of the sampled set relative to
the size of the tagged set, s/t. The nature of each variation, with the other two
sources held constant, is as follows:

(1) b/N decreases as st/N increases, for N and s/t constant.

(2) b/N increases with N, for st/N and s/t fixed.

(3) For st/N and N fixed, b/N is greatest with s/t = 1.



Figure 1 illustrates these variations, and a fourth discussed below.

Fig. 1. Variation with several parameter relations of percent bias of $N_0$.

The first property states the unsurprising fact that, given a particular program,
large samples produce accurate estimates. The third property says that if, in addition,
we make s and t unequal, we increase the accuracy of $N_0$ still more. However, in both
cases, the increased accuracy is paid for in time: larger sets of errors take longer to
find, and s+t increases as s/t departs from 1.

The second property says that under the same conditions of st/N and s/t we get
better results for programs with fewer errors, e.g., by estimating N after some
debugging has been done. However, as N increases, keeping st/N constant requires
relatively smaller samples. For N = 1000, for example, s = t = 100 gives st/N = 10,
while for N = 250, we get the same value of st/N with s = t = 50. That is, in the
second case, 100 bugs must be found, while in the first, with N four times as large,
only 200, or twice as many, bugs must be found. If we spend the same time relative to N
and find 400 bugs in the first case (s = t = 200, so that st/N = 40), we increase st/N
by a factor of 4 and decrease bias considerably. To sum up the argument, for an amount
of debugging time (as measured by s + t) proportional to N, $N_0$ has smaller bias for
large N.

C. Mean-Squared Error

New approximate formulas for $V(N_0)$, the mean-squared variation about the true
value N, were derived using Chapman's method and the Taylor's series method. They are,
respectively,

$$V(N_0) \approx N^2 \left\{ 1 + \frac{st}{N} \left[ -2\alpha_1 + \left(\frac{st}{N} - 2\right)\alpha_2 + \left(3\frac{st}{N} - 4\right)\alpha_3 + \cdots + \left(A_{m-1}\frac{st}{N} - 2(m-1)!\right)\alpha_m \right] \right\} \qquad (3)$$

where the $\alpha$'s are defined in Eq. (1) and

$$A_{m-1} = (m-1)! \sum_{j=1}^{m-1} \frac{1}{j}$$

and

$$V(N_0) \approx N^2\,[q(N/st) + 9q^2 (N/st)^2] \qquad (4)$$

where q is defined as in Equation (2).

An alternative form for Eq. (3) is

$$V(N_0) \approx N^2\,[(1 - 2k_1 + k_2) + (3k_3 - 2k_2)(N/st) + (11k_4 - 4k_3)(N/st)^2 + \cdots + (A_{m-1}k_m - 2(m-2)!\,k_{m-1})(N/st)^{m-2}] \qquad (3a)$$

where the k's are defined as in Equation (1a).

The formulas hold under the same conditions as the mean formulas: $P(0) \approx 0$, and
enough terms in the series. Furthermore, the same generalizations can be made with
respect to the variation of $V(N_0)$ with N, s, t.
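
The mean-squared error formulas can be compared numerically in the same way. This sketch assumes the reading of Eq. (3) in which the coefficient multiplying $\alpha_m$ is $A_{m-1} = (m-1)!\,(1 + 1/2 + \cdots + 1/(m-1))$, which gives 1, 3, 11 for the first three coefficients; the function names and the five-term default are hypothetical.

```python
from fractions import Fraction
from math import factorial

def coeff_A(n):
    """A_n = n! * (1 + 1/2 + ... + 1/n); A_{m-1} multiplies alpha_m in Eq. (3)."""
    return int(factorial(n) * sum(Fraction(1, j) for j in range(1, n + 1)))

def mse_n0_series(N, s, t, m=5):
    """Eq. (3), truncated after alpha_m."""
    x = s * t / N
    alpha = (N + 1) / ((s + 1) * (t + 1))       # alpha_1
    bracket = -2.0 * alpha
    for i in range(2, m + 1):
        alpha *= (N + i) / ((s + i) * (t + i))  # alpha_i
        bracket += (coeff_A(i - 1) * x - 2 * factorial(i - 1)) * alpha
    return N**2 * (1.0 + x * bracket)

def mse_n0_taylor(N, s, t):
    """Eq. (4): N^2 * [q*(N/st) + 9*q^2*(N/st)^2]."""
    q = (N - s) * (N - t) / N**2
    r = N / (s * t)
    return N**2 * (q * r + 9 * q**2 * r**2)

coefficients = [coeff_A(n) for n in range(1, 5)]
v_series = mse_n0_series(3000, 200, 200)
v_taylor = mse_n0_taylor(3000, 200, 200)
print(coefficients, v_series, v_taylor)
```

For N = 3000, s = t = 200 the two approximations agree to within a few percent, which is some evidence that the two formulas are mutually consistent as written here.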

D. Modified Maximum Likelihood Estimate $N_1$

An intermediate result in Chapman's derivation of $E(N_0)$,

$$E\left(\frac{1}{c+1}\right) = \frac{N+1}{(s+1)(t+1)}\,(1-K)$$

where

$$K = \begin{cases} \binom{N-s}{t+1} \Big/ \binom{N+1}{t+1} & \text{for } s+t < N \\ 0 & \text{otherwise} \end{cases}$$

suggested the modified estimate

$$N_1 = \frac{(s+1)(t+1)}{c+1} - 1$$

as a means of reducing the bias to practically zero, assuming $K \approx 0$. For

$$E(N_1) = (s+1)(t+1)\,E\left(\frac{1}{c+1}\right) - 1 = (N+1)(1-K) - 1$$

$$E(N_1) = N - K(N+1)$$

Therefore, when $K \approx 0$,

$$E(N_1) \approx N$$

The bias of $N_1$ is negative but very small even for st/N small. For example,
consider the case N = 6, s = 2, t = 3, st/N = 1. $E(N_1)$, computed exactly, is 5.8,
and b/N is 3.3%, whereas $E(N_0)$ is 6.6 with b/N = 10%. However, for $N_1$ as for
$N_0$, b/N increases if st/N is held fixed but N is increased. If N = 20, s = 4, t = 5,
st/N is still 1 but $E(N_1)$ is now 16.9 and b/N = 15.5%.
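
The two numerical examples above can be reproduced by summing over the hypergeometric distribution of c. The sketch below is illustrative: the function names are hypothetical, and the closed form uses K written as a ratio of binomial coefficients, which is an assumption about the exact printed form of K.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(c, N, s, t):
    """P(c common errors) when s errors are drawn from N, of which t are tagged."""
    return Fraction(comb(t, c) * comb(N - t, s - c), comb(N, s))

def exact_mean_n1(N, s, t):
    """E(N_1) by direct summation over all possible values of c."""
    total = Fraction(0)
    for c in range(max(0, s + t - N), min(s, t) + 1):
        n1 = Fraction((s + 1) * (t + 1), c + 1) - 1
        total += hypergeom_pmf(c, N, s, t) * n1
    return total

def closed_form_mean_n1(N, s, t):
    """E(N_1) = N - K(N+1), with K = C(N-s, t+1) / C(N+1, t+1) when s + t < N."""
    K = Fraction(comb(N - s, t + 1), comb(N + 1, t + 1)) if s + t < N else Fraction(0)
    return N - K * (N + 1)

small = exact_mean_n1(6, 2, 3)
larger = exact_mean_n1(20, 4, 5)
print(float(small), float(larger))
```

Direct summation gives 5.8 for the first case and about 16.9 for the second, matching the values quoted in the text, and the closed form agrees exactly in both cases.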

An additional advantage of $N_1$ is the fact that its variation about N is somewhat
lower than $V(N_0)$ for N greater than 50. Below 50, $V(N_0)$ is smaller. The
second-order approximation for $V(N_1)$, under the same approximation rules as
$E(N_0)$ and $V(N_0)$, is

$$V(N_1) \approx (s+1)^2 (t+1)^2\,[\alpha_2 + \alpha_3 + 2\alpha_4 + 6\alpha_5 + \cdots + (m-2)!\,\alpha_m] - (N+1)^2 \qquad (5)$$

$$\approx (s+1)^2 (t+1)^2 (N/st)^2\,[(k_2 - k_1^2) + k_3 (N/st) + \cdots + (m-2)!\,k_m (N/st)^{m-2}] \qquad (5a)$$

Let $\sigma_e = \sqrt{V}$. The variation of $\sigma_e(N_1)/N$ with relations among the
parameters, as described in detail in Section B, is plotted in Figure 2.

E.  Useful  Range 

It is obviously possible to make accurate and precise estimates with large enough
samples; the limiting case of s = t = N produces a perfect estimate. Whether a good
estimate can be made with considerably smaller samples is the issue. $N_1$ has almost
no bias, so the major problem resides in the variance (which, for zero bias, equals the
mean-squared error). As Eq. (5) and Fig. 2(a) show, the variance is low for the ratio
st/N large enough. But large ratios can be attained with relatively small samples only
for N large, e.g., for N = 3000, st/N = 13.33 can be realized with s = t = 200, or
one-fifteenth of N; but for N = 30, st/N = 13.33 requires s = t = 20, two-thirds of N.
Fortunately, Fig. 2(b) shows that smaller values of st/N are required to give a
specified value of $\sigma_e/N$ at the 30-error level than at 3000. The (s+t)/N = 1.0
curve in Fig. 2(d) shows the minimum value of $\sigma_e/N$ which can be attained if we
limit s and t to half of N. If we are willing to accept larger samples, we can, of
course, do better for the smaller values. Larger samples mean more time. For the same
time relative to N, estimates of larger programs will have lower $\sigma_e/N$
(Fig. 2(d)). Curves such as those of Fig. 2 can be exploited to design an estimation
test with knowledge of the trade-off between time and precision.

Fig. 2. Variation with several parameter relations of $\sigma_e/N$ for $N_1$.

F.  Design  of  a Seeding/Tagging  Reliability  Test 

The procedure is very simple. Our objective is to pick values for s and t which
will be likely to produce an estimate of the quality we want. We begin with a ballpark
estimate of the number of errors in the program, based on whatever information we
have -- length of program, amount of previous debugging, experience with other
programs of the same type, expertise of programmers. Suppose we decide there are
probably about 150 errors. In that event $N_1$ is the preferred estimate, since it is
practically unbiased and has lower variance than $N_0$ in that range. Had the estimated
N been below 50, we should have had to check the bias and dispersion of $N_0$ and then
choose between $N_0$ and $N_1$.

We will be content with $\sigma_e = 30$. Then $\sigma_e/N = 0.2$, and from Fig. 2(b)
we find that the intersection of 150 and 0.2 is on the curve for st/N = 13.33. (If
Fig. 2(a) contained a curve for N = 150, we could have found the same information
there.) Then st = 13.33 x 150 = 2000. We can let each debugger find about 45 errors,
or let one find 50 and the other 40. Figure 2(c) shows qualitatively that the results
will be about the same. We can also let s and t be, say, 20 and 100 and expect a
somewhat smaller $\sigma_e$, but we will have to wait considerably longer for the
results.

The cost beyond that for the debugging which would have to be done anyway would
be identical for all choices, since the additional cost is only for the common bugs and the
expected number of those is st/N = 13.33.
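
The arithmetic of this design step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ratio st/N = 13.33 is the value read from the curves for N = 150 and a relative dispersion of 0.2, and the function names and the list of trial values for s are hypothetical.

```python
import math

def target_product(n_estimate, st_over_n):
    """Product s*t implied by a curve value st/N and a ballpark N."""
    return st_over_n * n_estimate

def candidate_pairs(target_st, s_values):
    """For each trial s, pick the smallest t with s*t >= target_st."""
    return [(s, math.ceil(target_st / s)) for s in s_values]

target = target_product(150, 13.33)            # about 2000
pairs = candidate_pairs(target, [20, 40, 45, 50])
for s, t in pairs:
    # total bugs to find (s + t) and expected number of common bugs (s*t/N)
    print(s, t, s + t, round(s * t / 150, 2))
```

Every pair gives about the same expected number of common bugs, while the total number of bugs that must be found (and hence the waiting time) differs: 120 for the pair (20, 100) against 90 for the balanced choices.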

The situation would be a little different if the program were not to be completely
debugged. The test could, for example, be a means of comparing different programming
techniques. In that case, it would not only take longer but would also be more
expensive to find 120 bugs than to find 90.

Rome  Air  Development  Center 

F30602-74-C-0294  B.  Rudner 
MODULAR PROGRAMMING TECHNIQUES
E. Lipshitz, M. L. Shooman and H. Ruston


The objective of this effort is to find ways to write more reliable programs. To
this end we raise questions in the following three areas:


(1) Computer Language: Can we develop a high-level language that is less susceptible
to bugs? Can we prove that any one of the existing languages is
better in the sense of having fewer bugs?

(2) Structural Programming and Complexity of Programs: Can we derive a sequence
of steps in writing programs that will result in more reliable programs?
Is there a correlation between the structure and content of a program and the
number of bugs in it? If yes, which areas, statements or programming
techniques are more bug-prone? Can we eliminate or replace them?

(3) Automatic Programming: Will the use of pre-written and tested modules of
code reduce the number of bugs? How can they best be incorporated into the
program?

This  project  will  investigate  both  structural  and  automatic  programming  with  the 
emphasis  on  the  latter. 


A. The Program: Auto-Programming

The  program  "Auto-Programming"  is  divided  into  two  parts  --  "Flow"  and  "Auto," 
both  of  which  are  interactive  on-line  programs  that,  by  communicating  with  the  user, 
generate  his  program.  Currently,  they  are  written  and  generate  programs  in  Fortran. 

1.  The  Program  Flow 


"Flow"  receives  the  information  about  the  flowchart  of  a program  from  the  user. 
"Flow"  recognizes  only  four  different  types  of  blocks  which  are  sufficient  to  generate 
any  flowchart.  They  are: 


(1) Control Block. This is a conditional decision block, similar to the statement
IF ( ) GO TO. The user will write the code for this block.


(2)  Functional  Block.  This  block  will  perform  a task  that  is  available  in  the 
computer  library.  The  program  "Auto"  will  generate  the  correct  code  for 
this  block. 
(3) Stop Block. This block indicates the end of the path and is coded by the
STOP statement.


(4)  User's  Code  Block.  The  user  inserts  the  desired  code  into  this  block.  This 
block  is  used  whenever  the  library  cannot  perform  the  needed  task. 

Figure 1 illustrates how the above blocks can be used to construct the flow chart
of a given program. The program first builds a table "Clas" with the names of the
students in a class, and then enters their grades into the table "Grad."

The  flow  chart  generated  using  only  the  above  blocks  is  always  a binary  tree  in 
structure,  while  the  transfer  of  control  during  the  execution  might  behave  like  a graph. 

Upon completing the flow chart, control passes to the program "Auto," to be
discussed next.

2.  The  Program:  Auto 

The  user  will  specify  to  "Auto"  what  he  would  like  to  do,  and  "Auto"  will  advise 
him  which  methods  are  available  for  the  solution,  as  well  as  their  characteristics.  The 
user  then  will  choose  the  method  he  prefers,  and  "Auto"  will  generate  the  needed  code. 
"Auto"  is  divided  into  four  parts: 

(1)  Dictionary,  which  advises  the  user  of  tasks  available  in  the  computer  library. 

(2) Communicant, which supplies to and receives from the user all the information
needed to generate the code. In addition, it will sort the labels generated by
the user.

(3) Code generator, which uses information obtained with the help of the communicant
and generates the final code. It will resolve any conflicts between the
labels generated by the user and the duplicates generated by "Auto."

(4) Library, which contains a collection of code modules needed for the different
tasks, e.g., input/output, solution of mathematical problems, inventory and
banking programs, etc.

"Auto" was revised to allow dynamic label generation. The communicant scans
the user's statements looking for labels. The labels found are stored in a table "Lab"
in descending order. Before the code generator writes a label it searches "Lab" to see
whether this label is already in use.

Fig. 1. Record of student's grades.
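
The label bookkeeping described above can be sketched as follows. This is a minimal illustration in Python rather than in the Fortran of the actual package; the function names, the regular expression and the rule of taking the smallest unused number are assumptions made for the example, not a description of how "Auto" is implemented.

```python
import re

def collect_labels(statements):
    """Scan Fortran-style statements; return their numeric labels in descending order."""
    labels = set()
    for line in statements:
        match = re.match(r"\s*(\d+)\s+\S", line)   # leading statement label, if any
        if match:
            labels.add(int(match.group(1)))
    return sorted(labels, reverse=True)

def fresh_label(lab_table, start=1):
    """Return the smallest label >= start not yet in the table, and record it."""
    candidate = start
    while candidate in lab_table:
        candidate += 1
    lab_table.append(candidate)
    lab_table.sort(reverse=True)                   # keep the table in descending order
    return candidate

user_code = [
    "      READ (5,3) J",
    "    3 FORMAT (I1)",
    "    2 READ (5,5) CLAS(I)",
    "    5 FORMAT (A4)",
]
lab = collect_labels(user_code)
first = fresh_label(lab)
second = fresh_label(lab)
```

With the user labels 2, 3 and 5 already in the table, the two generated labels skip over them, so the generated code cannot collide with the user's statements.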

The library is being expanded. Code has been written and tested for the following
mathematical tasks:

$\ln(x) = 2 \sum_{n=1,3,5,\ldots}^{m} \frac{1}{n} \left( \frac{x-1}{x+1} \right)^{n}$,    $0.5 < x < 2.0$    (1)

$e^{x} = 1 + \sum_{n=1}^{m} \frac{x^{n}}{n!}$,    $-2.0 < x < 2.0$    (2)

$\sin(x) = \sum_{n=0}^{m} \frac{(-1)^{n} x^{2n+1}}{(2n+1)!}$,    $-2\pi < x < 2\pi$    (3)

$\cos(x) = \sum_{n=0}^{m} \frac{(-1)^{n} x^{2n}}{(2n)!}$,    $-2\pi < x < 2\pi$    (4)
2x4x6
X 3 X 5


m is such that $a_{m} < 10^{(\ldots)}$ and $a_{m-1} > 10^{(\ldots)}$
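
A truncated-series routine of the kind stored in the library might look like the sketch below. It is written in Python for illustration only. The stopping tolerance of 1e-6 is an assumption (the exponent in the stopping rule above is not legible), and the function names are hypothetical.

```python
import math

def series_sum(terms, tol=1e-6):
    """Add successive terms until one falls below tol in magnitude."""
    total = 0.0
    for a in terms:
        total += a
        if abs(a) < tol:
            break
    return total

def ln_terms(x):
    """Terms 2*(1/n)*((x-1)/(x+1))**n for n = 1, 3, 5, ..."""
    ratio = (x - 1.0) / (x + 1.0)
    n = 1
    while True:
        yield 2.0 * ratio ** n / n
        n += 2

def exp_terms(x):
    """Terms x**n / n! for n = 0, 1, 2, ..."""
    term, n = 1.0, 0
    while True:
        yield term
        n += 1
        term *= x / n

def sin_terms(x):
    """Terms (-1)**n * x**(2n+1) / (2n+1)! for n = 0, 1, 2, ..."""
    term, n = x, 0
    while True:
        yield term
        n += 1
        term *= -x * x / ((2 * n) * (2 * n + 1))

def cos_terms(x):
    """Terms (-1)**n * x**(2n) / (2n)! for n = 0, 1, 2, ..."""
    term, n = 1.0, 0
    while True:
        yield term
        n += 1
        term *= -x * x / ((2 * n - 1) * (2 * n))

approx_ln = series_sum(ln_terms(1.5))
approx_sin = series_sum(sin_terms(2.0))
```

Each term is obtained from the previous one by a single multiplication, which avoids recomputing factorials and powers at every step.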

The following is an example of code generation for a program. The program is
the same as in Figure 1.


Figure  2 shows  the  interaction  needed  between  "Flow"  and  the  user  to  generate 
his  flow  chart,  as  well  as  the  flow  chart  generated. 


Figure  3 shows  the  communication  between  "Auto"  and  the  user.  "Auto"  either 
allows  the  user  to  input  his  own  code  or  to  enter  the  information  it  needs  to  generate 
the  code. 


Figure 4 is the final code "Auto-Programming" has generated. It is important
to notice that the final code uses the user's variable names and their correct sizes, and
that the labels generated by "Auto" do not conflict with those generated by the user.

B.  Conclusion 


The  "Auto-Programming"  package  offers  the  following  three  advantages: 

(1) Ease of Operation: The software package is self-explanatory. The user needs
to know only how to gain access to the system. Once a connection is established,
the system will ask the user for all the information needed to generate
the correct program.

(2) Reliability: The code stored in the library will be pre-tested and debugged.

(3) Time and Cost: All indications are that both the time and the cost of writing and
debugging programs are reduced. An attempt will be made to substantiate this
statement. The example discussed in this report required 10 minutes
of terminal time and 18 seconds of C.P.U. time.


Rome  Air  Development  Center 
WHAT BLOCK DO YOU WANT?
INPUT 1 FOR CONTROL BLOCK
INPUT 2 FOR A FUNCTION BLOCK
INPUT 3 FOR STOP BLOCK
INPUT 4 FOR OTHER BLOCK
INPUT 5 FOR END BLOCK
INPUT 6 FOR AN ERROR
EACH LINE ENDS WITH *
LAST LINE ENDS WITH +
INTEGER CLAS (10)*
INTEGER GRAD (10)*
READ (5,3) J*
3 FORMAT (I1)+

EACH LINE ENDS WITH *
LAST LINE ENDS WITH +
IF(J.EQ.1) GO TO 7+

EACH LINE ENDS WITH *
LAST LINE ENDS WITH +
DO 2 I=1,10*
2 READ (5,5) CLAS (I)*
5 FORMAT (A4)+

INPUT 10 FOR SORT
INPUT 11 FOR SEARCH
10

WHICH SORT DO YOU WANT
INPUT 13 FOR INTERCHANGE SORT
INPUT 12 FOR SHELL SORT
INPUT 14 FOR BUCKET SORT
12

INPUT TABLE NAME, SIZE
CLAS 10

EACH LINE ENDS WITH *
LAST LINE ENDS WITH +
7 READ (5,8) NAME,IGRD*
8 FORMAT (A4,I3)+

INPUT 10 FOR SORT
INPUT 11 FOR SEARCH
11

INPUT 15 FOR LINEAR SEARCH
INPUT 16 FOR BINARY SEARCH
15

INPUT TABLE NAME SIZE AND WORD BEING SEARCHED FOR, LOCATION INDEX
CLAS 10 NAME K

EACH LINE ENDS WITH *
LAST LINE ENDS WITH +
GRAD(K)=IGRD*
READ(5,9)J*
9 FORMAT(R1)+

EACH LINE ENDS WITH *
LAST LINE ENDS WITH +
IF(J.EQ.121) GO TO 10+

EACH LINE ENDS WITH *
LAST LINE ENDS WITH +
GO TO 7+

EACH LINE ENDS WITH *
LAST LINE ENDS WITH +
10 DO 12 I=1,10*
12 WRITE (6,15) CLAS(I), GRAD(I)*
15 FORMAT(1H ,A4,10X,I3)+

Fig. 3. Illustration of the communication between the
program "Auto" and the user.




      INTEGER CLAS(10)
      INTEGER GRAD(10)
      READ(5,3)J
    3 FORMAT(I1)
      IF(J.EQ.1) GO TO 7
      DO 2 I=1,10
    2 READ(5,5) CLAS(I)
    5 FORMAT(A4)
C SHELL SORT
C INITIALIZE
      N=10
      ID=N
    6 ID=(ID+1)/2
      I=1
      I1=ID+I
      IFLG=0
C COMPARE, REPLACE, SET FLAG
   11 IF(CLAS(I).LT.CLAS(I1)) GO TO 1
      ITEMP=CLAS(I)
      CLAS(I)=CLAS(I1)
      CLAS(I1)=ITEMP
      IFLG=1
C PICK UP NEXT PAIR IF NOT REACHED END OF TABLE
    1 I1=I1+1
      IF(I1.GT.N) GO TO 4
      I=I+1
      GO TO 11
C IS TABLE SORTED
    4 IF(IFLG.GT.0) GO TO 6
      IF(ID.NE.1) GO TO 6
      STOP
    7 READ(5,8) NAME,IGRD
    8 FORMAT(A4,I3)
C LINEAR SEARCH
      DO 16 I=1,10
      IF(CLAS(I).EQ.NAME) GO TO 14
   16 CONTINUE
      WRITE (6,13)
   13 FORMAT(1H ,6X, SEARCH FAIL)
      STOP
   14 K=I
      GRAD(K)=IGRD
      READ(5,9)J
    9 FORMAT(R1)
      IF(J.EQ.121) GO TO 10
      GO TO 7
      STOP
   10 DO 12 I=1,10
   12 WRITE(6,15)CLAS(I),GRAD(I)
   15 FORMAT(1H ,A4,10X,I3)
      STOP
      END
      STOP

Fig. 4. The final code.
IMPLEMENTATION OF SHOOMAN'S MODEL OF EXHAUSTIVE TESTING: AN AUTOMATIC
TYPE 1.A TESTER

D. L. Baggi and M. L. Shooman

In an internal paper, "Analytical Models for Software Testings," Martin L.
Shooman describes a scheme for implementing a driver program to automatically test
each path of a given program. An implementation of this scheme in PL/I, with a few
revisions, is described here, along with two examples of programs which were run
both with normal analytical debugging techniques, i.e., with some testing data, and through
the testing program. Man-hour effort and computer time are compared for the two
cases.

A.  The  Testing  Driver  Program 

A program, referred to as the driver program, has been developed and run in conjunction
with two programming examples. Its purpose is to allow automatic testing of
all possible paths of any given program. A description of its functioning follows.

The driver program requires a data card containing an integer, N-TESTS, i.e.,
the number of IF statements plus the number of repetitive DO groups in the program
and its subroutines, to be supplied by the programmer.

The next data item has to be an "order," i.e., a character string such as 'NORMAL
OPERATION,' or 'TEST,' or any other string. If the order is 'NORMAL OPERATION,'
then the driver allows normal functioning of the program to be tested, e.g.,
with a set of data designed by the programmer to test some cases, a normal debugging
practice.

If the order is 'TEST,' no data set is needed for the tested program, but an array,
T, with lower bound 1 and upper bound N-TESTS (the number of tests, as read in previously),
will be constructed to represent, in ascending order, all possible bit combinations
of binary numbers from 0 up to 2**N-TESTS - 1; this array is called the testing word,
and it thus consists of the bits of a binary counter with N-TESTS bits. Notice that the
value 0 is in fact represented, in the corresponding T, by -1, while 1 is represented by 1.
The program under test is then run once for each such binary combination.
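
The construction of the testing word can be illustrated with a short Python sketch (the driver itself is written in PL/I). The function name and the choice of storing the most significant bit first are assumptions made for the example.

```python
def testing_words(n_tests):
    """Yield every testing word T of an n_tests-bit binary counter, in ascending order.

    Each word is a list of n_tests entries; bit 0 is stored as -1 and bit 1 as +1.
    """
    for count in range(2 ** n_tests):
        bits = [(count >> shift) & 1 for shift in range(n_tests - 1, -1, -1)]
        yield [1 if bit else -1 for bit in bits]

words = list(testing_words(2))
for word in words:
    print(word)
```

For N-TESTS = 2 this produces four words, one for each run of the tested program.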

For any other order string, such as 'ENOUGH,' as well as in the case of absence
of data, the whole system stops.

B.  The  Tested  Program 

Although no particular care has been taken to make sure that the driver program
is fully compatible with all possible programs to be tested, it is believed that, in its present
state, the invariant part is flexible enough to accept a large class of programs with no
modification, requiring for other programs only minor, sensible changes.

Shooman indicates (on p. 5-1 of his paper) a strategy for implementing the
driver program, namely, a revised way of writing IF statements and DO loops in a
program to be submitted to testing. However, since such schemes lack generality (i.e.,
only conditions of the type "exp > 0" are allowed in IF statements, and only limits from
1 up to an upper bound > 0 in DO loops), the scheme described here has been developed
as a natural derivative of these suggestions. Hence, the only restrictions to be obeyed
in writing a program will be the following:

(1) a) instead of    IF cond THEN statement1; ELSE statement2;
       write         IF F(cond) THEN statement1; ELSE statement2;

    b) instead of    DO I = LIMIT 1 TO LIMIT 2 BY INCR;
       write         DO I = GL (LIMIT 1, LIMIT 2) TO GH BY INCR;

    c) instead of    DO WHILE (cond);
       write         DO WHILE (H(cond));

    (where F, GL, GH and H are described in a forthcoming technical report.)

(2) function and subroutine procedures are possible but should be internal to the
program

(3) variables used in the program which are assigned a value through a read (GET
LIST) statement should be initialized, for instance through a DCL INIT statement.

All these restrictions could be removed. To remove (1), one could construct a
subprogram in the operating system which automatically supplies F, GL, GH and
H. Subroutines could be external, as long as a mechanism is provided for passing T and K
back and forth, e.g., COMMON statements in FORTRAN, hence removing (2).
Point (3) is a direct consequence of the read-in scheme described in I, which could be
modified at will.
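
Since F, GL, GH and H are left to a separate report, the following Python sketch is purely hypothetical. It only illustrates the idea stated above: a wrapper around each condition which, under 'TEST', ignores the real condition and consumes the next entry of the testing word T, with K as the running position. The class name, the use of `None` for normal operation and the sample routine `classify` are all assumptions.

```python
class Driver:
    """Hypothetical driver state: testing word T and position K of the next bit."""

    def __init__(self, testing_word=None):
        self.T = testing_word   # None stands for NORMAL OPERATION
        self.K = 0

    def F(self, cond):
        """Stand-in for the condition of an IF statement."""
        if self.T is None:
            return bool(cond)   # normal run: honor the real condition
        bit = self.T[self.K]    # test run: the testing word decides the branch
        self.K += 1
        return bit == 1

def classify(driver, x):
    """Small tested routine with two IF statements written as IF F(cond)."""
    if driver.F(x > 0):
        if driver.F(x > 100):
            return "large"
        return "positive"
    return "non-positive"

normal = classify(Driver(), 500)
forced = classify(Driver([1, -1]), 500)
```

In the second call the data value 500 plays no role: the word [1, -1] forces the first branch to be taken and the second to be refused, which is how every path can be reached without designing input data.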

C.  The  Two  Examples 

For illustration, two programs were chosen at the two ends of a spectrum: one
with many IF statements and input data, and the other with DO groups and one subroutine,
and no input data.

1.  First  Example:  Computer  Solution  of  a Card  Game 

This is a very slightly altered version of Shooman's algorithm of Fig. 4-3 in his
paper. The algorithm appears in Figure 1 here. It determines the winner of a card
game, in which player A is dealt two cards, A1 and A2, and player B, similarly, gets
B1 and B2 (four integers read in with a GET LIST statement). If both players have a
pair, the highest pair wins, or if they are equal it is a tie; if only one player has a pair,
he wins; otherwise, the highest card wins, or if they are equal, the highest second card
wins; identical hands are ties. At first the system was run under 'NORMAL OPERATION'
conditions. The results are shown in a forthcoming report. The system was
next run under 'TEST' conditions. The tested program has 12 IF statements, hence we
have a 12-bit testing word.

Since the program is fully debugged, no error can be seen in the output listing;
should an error appear, however, it would be easy, from the testing word, to reconstruct
the path and detect the deficient statement.
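
The rules of the card game stated above can be written compactly as follows. This Python sketch is an independent restatement of the rules, not the PL/I program of Figure 1, and the function names are hypothetical.

```python
def compare(x, y):
    """Return 'A' if x beats y, 'B' if y beats x, 'TIE' otherwise."""
    if x > y:
        return "A"
    if x < y:
        return "B"
    return "TIE"

def winner(a1, a2, b1, b2):
    """Decide the game for hands (a1, a2) of player A and (b1, b2) of player B."""
    a_pair, b_pair = a1 == a2, b1 == b2
    if a_pair and b_pair:
        return compare(a1, b1)          # highest pair wins, equal pairs tie
    if a_pair:
        return "A"                      # only A has a pair
    if b_pair:
        return "B"                      # only B has a pair
    a_high, a_low = max(a1, a2), min(a1, a2)
    b_high, b_low = max(b1, b2), min(b1, b2)
    if a_high != b_high:
        return compare(a_high, b_high)  # highest card wins
    return compare(a_low, b_low)        # then the second card; identical hands tie

print(winner(9, 4, 9, 6))
```

Written this way the routine contains fewer decisions than the 12 IF statements of the tested program, which spells the comparisons out one at a time.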

2. Second Example: A Program Which Prints the Prime Factors of All
Integers from 1 to 100

This is a simple program which tests each integer from 1 to 100 and prints it and
its prime factors, or the word PRIME if it is prime. Its algorithm is presented in
Figure 2.

The internal subroutine PRINTOUT prints the results; this procedure has been
incorporated to show the generality of the scheme, including subprograms. Notice that,
although it is called only once from the main program, it could be invoked as many
times as needed, because of the design of the internal procedures of the driver program,
which know by themselves how to select a new bit in the testing word each time they
are used.

D.  Efficiency  of  the  System 

Program Example A

(1) It took 30 minutes to design the program

(2) It takes no extra time to redesign a program according to the specifications
expressed in Section B

(3) It took 10 minutes to find a data set to test some well-chosen paths

(4) The program ran in 2.41 minutes under NORMAL OPERATION with that set
of data

(5) The program ran in 4.12 minutes under TEST conditions, exploring all paths.

Program Example B

(1) It took 20 minutes to design the program

(2) There are no data

(3) The program ran in 3.81 minutes for 100 integers

(4) The program ran in 1.76 minutes under TEST conditions.

Hence the system provides the following:

Advantage

No time is spent in finding a data set to debug a program.

Disadvantage

Running time may increase exponentially with the number of IF statements and
DO loops. For instance, Program A contains 12 IF statements, and therefore the
testing word has 13 bits and 8192 runs through the program were needed to test its 100
paths (hence, with this blind mechanical approach many tests are meaningless). However,
if the program becomes too large, it can be separated into portions to be tested
individually, along with their interconnecting data sets. Furthermore, this disadvantage
is compensated by those cases of programs with many DO loops and few IF statements.
Program B, for instance, required almost four minutes for a hundred integers, and
would use a lot more for, say, 10,000 integers, but it took less than two minutes to go
through all paths as defined, and would still take the same amount of time no matter
how many integers it would have to consider. Hence a TEST run is, in these cases,
very time saving.
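
The growth in the number of runs, and the saving obtained by testing portions separately, can be made concrete with a small calculation. The Python sketch below is illustrative only; the split of a 13-bit testing word into portions of 7 and 6 tests is a hypothetical example, and the cost of preparing the interconnecting data sets is ignored.

```python
def runs_needed(n_tests):
    """Number of runs for one testing word of n_tests bits."""
    return 2 ** n_tests

def runs_if_split(portion_sizes):
    """Total runs when each portion is tested exhaustively on its own."""
    return sum(runs_needed(n) for n in portion_sizes)

whole = runs_needed(13)
split = runs_if_split([7, 6])
print(whole, split)
```

Testing the two portions separately replaces a product of run counts by a sum, which is why partitioning a large program keeps the exhaustive approach practical.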

E.  Conclusions 

A possible implementation of an automatic program tester of Type 1.A has been
discussed. The tester ignores the semantics of the tested program, but is nevertheless
able to run through all possible paths present in a program and catch a possible error.

This implementation, rather than representing an ultimate result, is meant to be
an illustration of the method and techniques proposed by Shooman in his paper; it would
be easy, for instance, to make the driver program more flexible or suited to other
styles or programming languages.

It was rewarding to discover that even within the limited development of these
techniques some goals have been achieved, namely, the realization of a system which
has already proved its usefulness in debugging the described programs.

Rome Air Development Center

F30602-74-C-0294  D.  L.  Baggi  and  M.  L.  Shooman 
OPTIMAL INSPECTION SCHEDULES FOR FAILURE DETECTION WHEN TESTS
HASTEN FAILURES

L. Shaw and N. Wattanapanom

Barlow, Hunter and Proschan (Refs. 1, 2, 3) posed and solved an interesting class of
inspection problems related to reliability and maintainability of systems. In their
models, a system operates for a random time until it fails, but the failure can be detected
only after a test costing $c_1$ has been performed. For example, judgment of whether
or not a radar system's detection and false-alarm probabilities are acceptable might
require connection of special test equipment and injection of special testing signals.

One approach to this problem seeks optimal testing times to minimize a loss
function of the form

$L = E[c_1 N + c_2 d]$    (1)

in which N is the number of tests until the first one after the failure, and d is the time
between the failure time and the subsequent detection time $t_N$. Reference 3 gives
algorithms for finding the optimal inspection times $t_i$ for lifetime distributions which
are uniform, exponential or Polya of order 2, when using the single-cycle (no renewal)
loss function in Equation (1).

A second point of view postulates repair or replacement after each failure detection,
with associated average cost and average repair time requirements. In this case
the performance is judged using the mean cost per unit time or, equivalently, the ratio
of mean-cost-per-renewal to mean-time-between-renewals (Ref. 4). Reference 3 shows how
techniques used for the single-cycle problem can be modified to find the best testing
times for the case with renewals.

Here we seek similar solutions but include the possibility that the mechanical and
electrical stresses of the test might destroy or damage the system. Mathematically,
we allow the ith test to destroy the system with probability $\beta$ or, with probability $(1 - \beta)$,
to increase the failure rate (reciprocal of mean time-to-failure) to $\lambda_i$ without
changing the form of the conditional lifetime distribution. These generalizations are
approached in a direct manner by using dynamic programming to set up recursive
optimization equations. The main contribution here is the presentation of convergent
algorithms for solving the optimization equations. Sensitivity of inspection policy
performance to uncertainty in failure rate and degradation parameters is also studied.

The initial phases of this work, dealing with single-cycle (no renewal) cases for
exponential and uniform conditional lifetime distributions, were summarized in last
year's research summary (Ref. 5). Here we show how those results generalize to the case in
which the failed part is replaced after each failure detection, and the lifetime distributions
are exponential. More details about this case, as well as those for other conditional
distributions, can be found in Reference 6.

When the mean cost for component replacement and the mean time to accomplish that
replacement are denoted by s and r, respectively, then the ith cycle, starting at time
$t_{0,i}$ and ending at $t_{0,i+1}$, will accrue a mean cost

$C_i = N_i c_1 + d_i c_2 + s$    (2)

and will have a duration

$T_i = t_{0,i+1} - t_{0,i}$    (3)

Using these terms the average cost is defined as

$R(\delta) = \lim_{k \to \infty} \frac{\sum_{i=1}^{k} C_i}{\sum_{i=1}^{k} T_i}$    (4)

The argument $\delta$ is shown to emphasize the dependence of this average cost on an
inspection policy $\delta = \{\delta_0, \delta_1, \ldots\}$.

It is well known (Ref. 4) that $R(\delta)$ can be expressed in terms of the mean cost per cycle
U and the mean cycle duration T, as

$R(\delta) = U/T$    (5)

Both the numerator and denominator of Eq. (5) are affected by the inspection policy $\delta$.
However, it is possible to find the $\delta$ which minimizes $R(\delta)$ by considering an auxiliary
optimization problem having the cost function

$\mathcal{L}(\mu, \delta) = U - \mu T$    (6)


These two problems are related as follows.

Theorem: If there exists a $\mu = \mu^*$ for which $\min_{\delta} \mathcal{L}(\mu^*, \delta) = \mathcal{L}(\mu^*, \delta(\mu^*)) = 0$, then
the schedule $\delta(\mu^*)$ also minimizes $R(\delta)$.

The existence of such a desired $\mu^*$ is quite evident for the present problem.
Equation (6) can be rewritten as

$\mathcal{L}(\mu, \delta) = c_1 \bar{N} + c_2 \bar{d} + s - \mu[\bar{T} + r]$    (7)

A little thought shows that when $\mu > c_2$, increases in the $\delta_i$ will decrease $\mathcal{L}$ since $\bar{T} > \bar{d}$.
Thus, for $\mu > c_2$ the optimal policy is to make no inspections or repairs, with a corresponding
optimal $\mathcal{L} < 0$. On the other hand, when $\mu = 0$, $\mathcal{L} > 0$ for any policy, including
the optimal one. We conclude that $\mathcal{L}(\mu, \delta(\mu))$ changes from a positive value at $\mu = 0$
to a negative one for $\mu = c_2$, and this change must be a continuous one passing through
zero at least once in view of the smooth nature of all densities and cost terms entering
into the loss functions.

An algorithm which uses the Theorem to optimize inspection times in the presence
of renewals consists of two parts. An algorithm like the single-cycle one in Ref. 5 is
needed to minimize Eq. (7) for each choice of $\mu$, and the choice of $\mu$ is then adjusted
until $\mathcal{L}(\mu, \delta(\mu)) = 0$. The first part
parallels the approach in the single-cycle case, since $\mathcal{L}$ defined in Eq. (7) is a single-cycle
loss with $\mu$ similar to $c_2$.
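
The outer part of such an algorithm can be sketched as a one-dimensional root search. The Python example below is only an illustration under stated assumptions: the search method (bisection) is our choice for the sketch, and the inner minimization is replaced by a toy problem with three hypothetical policies, each described by its mean cost per cycle U and mean cycle duration T.

```python
def minimize_ratio(policies, mu_high, tol=1e-9):
    """Find mu* with min over policies of (U - mu*T) equal to zero.

    policies maps a policy name to (U, T); mu_high must make the minimum negative.
    Returns (mu_star, name of the minimizing policy).
    """
    def inner(mu):
        # Inner problem: minimize the auxiliary loss U - mu*T over the policies.
        name = min(policies, key=lambda k: policies[k][0] - mu * policies[k][1])
        cost, duration = policies[name]
        return cost - mu * duration, name

    low, high = 0.0, mu_high
    while high - low > tol:
        mid = 0.5 * (low + high)
        value, _ = inner(mid)
        if value > 0:
            low = mid        # loss still positive: mu is too small
        else:
            high = mid       # loss negative or zero: mu is large enough
    return high, inner(high)[1]

toy = {"frequent": (6.0, 0.40), "moderate": (5.0, 0.45), "sparse": (7.5, 0.50)}
mu_star, best = minimize_ratio(toy, mu_high=20.0)
print(round(mu_star, 4), best)
```

For the toy data the root coincides with the smallest ratio U/T among the three policies, which is the content of the Theorem; in the inspection problem the dictionary lookup would be replaced by the single-cycle minimization of Eq. (7).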

In particular, Eq. (7) leads to the following recursive expression for the minimal
mean future loss after test i:

$\mathcal{L}_i^o = \min_{\delta_i} \{ c_1 + (c_2 - \mu)\delta_i - c_2/\lambda_i + (s - \mu r) + e^{-\lambda_i \delta_i} (\mathcal{L}_{i+1}^o - s + \mu r + c_2/\lambda_i) \}$    (8)

Differentiation yields the recursive optimization equations

(i) $(c_2 - \mu) = \lambda_i e^{-\lambda_i \delta_i^o} (\mathcal{L}_{i+1}^o - s + \mu r + c_2/\lambda_i)$
(ii) $\mathcal{L}_i^o = c_1 - \mu/\lambda_i + (s - \mu r) + (c_2 - \mu)\delta_i^o$    (9)

It can be shown (Ref. 6) that the $\mu < c_2$ condition derived above insures that the zero-derivative
condition corresponds to a minimum and that all $\delta_i^o > 0$. Finally, the Lemma in the
single-cycle case must be modified for the present loss function to

Lemma: $\lim_{k \to \infty} \mathcal{L}_k^o = c_1 - \mu r + s$    (10)

Once the $\mu^*$ of the Theorem has been found, it follows immediately that

$\min_{\delta} R(\delta) = \mu^*$    (11)
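
For a fixed value of $\mu$ the recursion can be run backward from a late stage. The Python sketch below is a minimal illustration, not the authors' program. It assumes that the failure rate after i tests is $\lambda_0/p^i$, that the recursion is truncated after a fixed number of stages with the limit of the Lemma used as terminal value, and that the probability of destroying the system is left out, as it does not appear in the stage expression; the function names are hypothetical.

```python
import math

def stage_loss(delta, lam, next_loss, mu, c1, c2, s, r):
    """Bracketed expression of the stage recursion for a trial interval delta."""
    return (c1 + (c2 - mu) * delta - c2 / lam + (s - mu * r)
            + math.exp(-lam * delta) * (next_loss - s + mu * r + c2 / lam))

def backward_schedule(mu, lam0, p, stages, c1, c2, s, r):
    """Return (losses, deltas) for stages 0 .. stages-1, computed backward."""
    next_loss = c1 - mu * r + s              # terminal value taken from the Lemma
    losses = [0.0] * stages
    deltas = [0.0] * stages
    for i in reversed(range(stages)):
        lam = lam0 / p ** i                  # assumed degradation law
        inner = next_loss - s + mu * r + c2 / lam
        # zero-derivative condition solved for the interval
        delta = math.log(lam * inner / (c2 - mu)) / lam
        # minimal loss at this stage
        loss = c1 - mu / lam + (s - mu * r) + (c2 - mu) * delta
        losses[i], deltas[i] = loss, delta
        next_loss = loss
    return losses, deltas

losses, deltas = backward_schedule(mu=12.0, lam0=5.0, p=0.9, stages=21,
                                   c1=1.0, c2=20.0, s=0.0, r=0.0)
```

The value `losses[0]` plays the role of $\mathcal{L}(\mu, \delta(\mu))$, so a search over $\mu$ between 0 and $c_2$ for the value at which it vanishes completes the procedure.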

Sensitivity

The preceding paragraphs describe how to get the best inspection schedule $\delta$ and
the corresponding minimal loss per unit time. It is also of interest to see how sensitive
this solution procedure is to the precision of knowledge of model parameters, e.g.,
$\lambda_0$, and how sensitive it is to the accuracy of the iterative calculations. It is
straightforward to use the exponential conditional densities to get iterative expressions for $\bar{N}$,
$\bar{d}$ and $\bar{T}$ which can be terminated when the additional terms contribute very little (Ref. 6).

Examples

Tables I, II and III show results for numerical examples based on the models in
this section. Table I shows the small sensitivity of $R(\delta)$ to use of non-optimal testing
schedules that result when $\mu \neq \mu^*$. The parameters for Tables II and III differ only in
that the latter has non-zero values for r and s, the renewal costs. Both of those tables
show minimum R values for several $\lambda_0$ as well as the performance degradation when
$\lambda_0 = 5$ but the optimum $\delta$-schedule for a different (mismatched) $\lambda_0$ value is used.

TABLE I. Mean loss rate minimization - exponential case
$c_1 = 1.0$, $c_2 = 20.0$, $s = 0$, $r = 0$, $\lambda_0 = 5.0$, $M = 21$, $p = 0.9$, $k = 2.0$, and $\lambda_i = \lambda_0/p^i$

    $\mu$                    R
    10                       12.72591
    11                       12.67058
    12                       12.63822
    $\mu^*$ = 12.63183       12.63200
    13                       12.63429
    14                       12.66670
    15                       12.74756
    16                       12.89658
    17                       13.14914
    18                       13.57903
    19                       14.39155

TABLE II. Minimum loss rates - exponential case; sensitivity to $\lambda_0$ error
$c_1 = 1.0$, $c_2 = 20$, $s = 0$, $r = 0$, $p = 0.9$, $M = 21$, and $\lambda_i = \lambda_0/p^i$

    $\lambda_0$     $R(\delta)$ mismatch     $R_{min}$
    2               13.05727                 8.68520
    3               12.75710                 10.27669
    4               12.65410                 11.55473
    5               12.63183                 12.63200
    6               12.64438                 13.56575
    7               12.66983                 14.38974
    8               12.69736                 15.12581
    9               12.72094                 15.78896
    10              12.73731                 16.38998
TABLE III. Mean loss rates - exponential case;
sensitivity to $\lambda_0$ error - non-zero renewal costs.
$c_1 = 1.0$, $c_2 = 20.0$, $s = 1.2$, $r = 0.001$, $p = 0.9$, $M = 21$, and $\lambda_i = \lambda_0/p^i$

    $\lambda_0$     $R(\delta)$ mismatch     $R_{min}$
    2               16.28924                 10.62738
    3               16.22188                 12.87934
    4               16.21368                 14.70045
    5               16.21360                 16.21360
    6               16.24733                 17.47788
    7               16.38830                 18.52204
    8               16.38830                 19.35293
RELIABILITY APPLICATIONS OF A MULTIVARIATE EXPONENTIAL DISTRIBUTION
L. Shaw and C-L. Hsu

Several types of multivariate exponential distributions have been proposed for
various reliability applications (Refs. 1, 2). These are joint distributions for which the
univariate marginal distributions are negative exponentials. The present work has
examined one particular class of these distributions for both modeling of downtime
distributions and modeling of stages of component deterioration.

The n non-independent exponential random variables

$r_1, r_2, \ldots, r_n$

can be defined with respect to two sets of normal random variables. In particular, let
w and z be two stochastically independent n-vectors, each being normally distributed
with a zero mean-value vector, and each with the same covariance matrix $\Gamma = [\gamma_{ij}]$. It is
easy to show that defining

$r_i = w_i^2 + z_i^2$    (2)

makes the $r_i$ correlated exponential random variables with mean values

$E[r_i] = 2\gamma_{ii}$

and correlation coefficients

$\rho_{ij} = \gamma_{ij}^2 / (\gamma_{ii}\gamma_{jj})$

The precise form of the corresponding multivariate density will depend on the
type of $\Gamma$ matrix selected. For example, if

$\gamma_{ij} = \sqrt{\sigma_i^2 \sigma_j^2}\; \rho^{|i-j|}$

then the bivariate and trivariate densities are

$f(r_1, r_2) = [4\sigma_1^2\sigma_2^2(1-\rho^2)]^{-1} \exp\left[ -\left( \frac{r_1}{\sigma_1^2} + \frac{r_2}{\sigma_2^2} \right) / 2(1-\rho^2) \right] I_0\left[ \frac{\rho\sqrt{r_1 r_2}}{\sigma_1\sigma_2(1-\rho^2)} \right]$

$f(r_1, r_2, r_3) = [8\sigma_1^2\sigma_2^2\sigma_3^2(1-\rho^2)^2]^{-1} \exp\left[ -\left( \frac{r_1}{\sigma_1^2} + \frac{r_2(1+\rho^2)}{\sigma_2^2} + \frac{r_3}{\sigma_3^2} \right) / 2(1-\rho^2) \right] I_0\left[ \frac{\rho\sqrt{r_1 r_2}}{\sigma_1\sigma_2(1-\rho^2)} \right] I_0\left[ \frac{\rho\sqrt{r_2 r_3}}{\sigma_2\sigma_3(1-\rho^2)} \right]$

in which $I_0$ is the modified Bessel function of order zero.
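
The construction from normal generators is easy to check by simulation. The Python sketch below draws pairs $(r_1, r_2)$ for the bivariate case and estimates their means and correlation. The parameter values, the sample size and the function names are arbitrary choices for the illustration; the expected correlation $\rho^2$ follows from the definition, since the covariance of $w_1^2$ and $w_2^2$ is $2\gamma_{12}^2$ for zero-mean normal variables.

```python
import math
import random

def sample_pair(sigma1, sigma2, rho, rng):
    """Draw (r1, r2) with r_i = w_i**2 + z_i**2 and cov(w1, w2) = sigma1*sigma2*rho."""
    def correlated_normals():
        u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        return sigma1 * u, sigma2 * (rho * u + math.sqrt(1.0 - rho * rho) * v)
    w1, w2 = correlated_normals()
    z1, z2 = correlated_normals()   # z is independent of w, same covariance
    return w1 * w1 + z1 * z1, w2 * w2 + z2 * z2

def summarize(samples):
    """Return the two sample means and the sample correlation coefficient."""
    n = len(samples)
    m1 = sum(a for a, _ in samples) / n
    m2 = sum(b for _, b in samples) / n
    cov = sum((a - m1) * (b - m2) for a, b in samples) / n
    v1 = sum((a - m1) ** 2 for a, _ in samples) / n
    v2 = sum((b - m2) ** 2 for _, b in samples) / n
    return m1, m2, cov / math.sqrt(v1 * v2)

rng = random.Random(0)
samples = [sample_pair(1.0, 1.5, 0.6, rng) for _ in range(40000)]
m1, m2, corr = summarize(samples)
print(round(m1, 2), round(m2, 2), round(corr, 2))
```

With $\sigma_1 = 1$, $\sigma_2 = 1.5$ and $\rho = 0.6$ the sample means should be close to $2\sigma_1^2 = 2$ and $2\sigma_2^2 = 4.5$, and the sample correlation close to $\rho^2 = 0.36$.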

A.  Modeling 

The  design  of  maintenance  policies  for  maintainable  and  repairable  systems  makes 
use  of  information  about  probability  distributions  of  component  lifetimes  and  downtimes. 
Since  such  designs  are  facilitated  if  these  distributions  have  simple  analytic  forms, 
downtime distributions have been frequently modeled as lognormal, Weibull or Erlang
in form. These distributions all are skewed and correspond to non-negative random
variables  like  downtimes.  Here  we  consider  one  more  family  of  distributions  which 
has  some  physical  motivation. 

Since a downtime interval is often the sum of subsidiary intervals (for failure isolation, component removal, repair, reassembly, alignment, etc.), it seems reasonable to think of the downtime $x_n$ as a sum of subsidiary time intervals

$$x_n = \sum_{i=1}^{n} r_i. \tag{8}$$


The subscript on $x_n$ reminds us of the number of summands. Several distributions are possible for the individual $r_i$, but here we consider exponential distributions which are the simplest and which are widely used to represent random times between events.

It is well known that the downtime $x_n$ will have an Erlang distribution if the $r_i$ are independent exponential variables with identical mean values. Muth (Ref. 5) has considered the approximation of Weibull and lognormal distributions by $x_n$ in which the $r_i$ are independent exponential variables but with possibly different mean values. Here we further generalize to allow dependence among the $r_i$, a reasonable situation if the variables represent related steps in a sequence of downtime operations. In particular, we use $r_i$ with the multivariate exponential distribution described above.

Figure 1 shows some possible densities for $x_2$ (i.e., when $n = 2$ in Equation (8)). Each density shown there can be developed in two different ways. One approach assumes $r_1$ and $r_2$ are independent, with unequal mean values, and $f(x_2)$ can be found by convolution of their marginal exponential densities. Alternatively, when the $r_i$ are not independent, they can be expressed in terms of their normally distributed generators

$$x_2 = w_1^2 + w_2^2 + z_1^2 + z_2^2 \tag{9}$$

and the Laplace transform of the density of $x_2$,

$$\Phi_{x_2}(s) = E\left[e^{-s x_2}\right], \tag{10}$$

can be computed directly using the properties of $w_i$ and $z_i$. For example, when $\sigma_1 = \sigma_2 = 1$ and $\rho = \rho_{w_1 w_2}$ then

$$\Phi_{x_2}(s) = \left[4(1-\rho^2)s^2 + 4s + 1\right]^{-1} \tag{11}$$

and the densities in Fig. 1 can be found by Laplace inversion for various values of $\rho$.

[Figure 1: five densities $f_{x_2}(x_2)$, cases i) to v), each obtainable either as (a) equal-mean, correlated variables or (b) independent variables with unequal means. Case i) has $\rho = 0$ and $E[r_1] = E[r_2] = 2$, giving $f_{x_2}(x_2) = \tfrac{1}{4} x_2 e^{-0.5 x_2}$; case v) has $\rho = 1$ and independent means 4 and 0.]

Fig. 1. Densities for sums of exponential variables.

The equivalence of a sum of independent exponential random variables to the sum of different, but correlated, exponential random variables suggested the following.

Theorem: All possible density functions for $x_n$ defined in Eq. (2) and Eq. (8) can be achieved using independent $r_i$, i.e., with diagonal $\Gamma = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2)$.

This theorem is proved in Ref. 3 by showing that with $\Gamma = MM^T$ then

$$\Phi_{x_n}(s) = \left|I + 2sM^T M\right|^{-1} = [q(s)]^{-1}. \tag{12}$$

It can be argued that $q(s)$ is an $n$-th degree polynomial with real, negative roots and $q(0) = 1$. Thus, given Eq. (12) for any arbitrary $\Gamma$, some other diagonal $\hat{\Gamma}$ can be found to produce the same $q(s)$ and $\Phi_{x_n}(s)$.

In general, the $n$ independent exponential variables whose sum is indistinguishable from the sum of $n$ correlated ones will have different mean values from those of the correlated variables.
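One way to make the theorem concrete is through the eigenvalues of the covariance matrix: since $\det(I + 2s\Gamma) = \prod_i (1 + 2s\lambda_i)$, independent exponentials with means $2\lambda_i$ reproduce the same Laplace transform. The sketch below assumes this eigenvalue route, which is not necessarily the argument used in Ref. 3; the function names and the test point `s` are hypothetical.

```python
import numpy as np


def equivalent_independent_means(gamma):
    """Means of independent exponentials whose sum matches sum_i (w_i^2 + z_i^2)."""
    eigenvalues = np.linalg.eigvalsh(gamma)   # real and nonnegative for a covariance matrix
    return 2.0 * eigenvalues[::-1]            # largest mean first


def laplace_transform(gamma, s):
    """E[exp(-s x_n)] = 1 / det(I + 2 s Gamma) for the correlated construction."""
    n = gamma.shape[0]
    return 1.0 / np.linalg.det(np.eye(n) + 2.0 * s * gamma)


rho = 0.5
gamma = np.array([[1.0, rho], [rho, 1.0]])    # equal means E[r_i] = 2, correlated generators
means = equivalent_independent_means(gamma)

s = 0.7                                        # arbitrary test point
lhs = laplace_transform(gamma, s)
rhs_eq11 = 1.0 / (4.0 * (1.0 - rho**2) * s**2 + 4.0 * s + 1.0)   # Eq. (11)
rhs_independent = np.prod(1.0 / (1.0 + means * s))               # product of exponential transforms
print(means, lhs, rhs_eq11, rhs_independent)
```

For $\rho = 0.5$ the equivalent independent means come out as 3 and 1, which differ from the common mean 2 of the correlated variables, as the closing remark above states.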

Once this structure has been established for sums of correlated exponential variables, previous results for sums of independent variables (e.g., those of Muth (Ref. 5)) are directly applicable. However, it appears that when equal-mean variables have their mean and correlation $\rho$ in Eq. (11) adjusted so $x_n$ matches the lognormal mean and variance, then $f_{x_n}$ tends to lognormal for large $n$. This is in contrast to Muth's summing of independent variables in which he had to search for the proper individual mean values which made the sum approximately lognormal.

B.  Deterioration  Modeling  for  Replacement  Schedule  Optimization 


One  model  frequently  used  to  describe  component  deterioration  assumes  that  a 
new  component  retains  its  good  qualities  for  a random  time  duration,  after  which  it 
performs  for  another  random  time  interval  with  reduced  quality,  etc.  The  quality  k is 
thus  assumed  to  take  on  a finite  set  of  values,  say  k = 0, 1,  • • • ,N,  with  k = 0 for  a new 
part  and  k=N  for  a worthless  part.  Luss  has  developed  optimal  inspection  schedules 
(for  observing  k)  and  replacement  rules  (for  resetting  k=  0)  with  respect  to  a specified 
reward  measure.  We  are  attempting  to  generalize  his  work  with  independent  exponential 
durations  in  each  stage,  by  using  the  correlated  exponential  variables,  defined  above, 
to  represent  the  durations  of  the  stages. 

In the simplest case we have assumed that a system is characterized by a single deterioration state $k$; transitions from $k = i$ to $i+1$ are immediately observable; replacement (setting $k = 0$) is mandatory when $k = N$; rewards are decreasing functions of $k$; and the desired replacement rule must maximize the average reward per unit time.

Two  kinds  of  reward  structures  were  considered  for  a system  in  state  k = i; 

1. Linear: $c_i$ dollars per unit time,

$$c_0 > c_1 > c_2 > \cdots > c_N$$


2.  Quadratic:  c^^t  per  unit  time  after  t seconds  in  state  i 


In the linear case the optimal replacement rule is to replace when $k$ reaches some $k^*$, independent of $r_0, r_1, \ldots$, the durations in states $k = 0, k = 1, \ldots$; and independent of the correlations between those durations.

In the quadratic case, dynamic programming arguments have shown that the best replacement rule is of the form: replace on entering $k$ if $r_{k-1} < r_k^*$. This rule reflects the positive correlation between successive durations. A small $r_{k-1}$ yields predictive information that the rate of deterioration is great. It is anticipated that future work will show that the decision thresholds are ordered according to

* # 

""k-l  ^k 


Derivation  of  the  quadratic  case  results  required  study  of  properties  of  the  con- 
ditional densities  of  the  r^^  sequence.  In  particular,  the  r^^  sequence  preserves  the 
Markov  property  of  the  underlying  gaussian  sequences  defined  by  Eq.  (5);  and  the  con- 
ditional variables  are  stochastically  increasing  in  the  sense  that  the  conditional  dis- 
tribution function  increasing  function  of  r^^  ^ for  every  r^^. 
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NEW  RESULTS  IN  SYSTEM  THEORY:  BAUER-TYPE  FACTORIZATION  OF  POSITIVE 
MATRICES  AND  THE  THEORY  OF  MATRIX  POLYNOMIALS  ORTHOGONAL  ON  THE 
UNIT  CIRCLE 

D.C.  Youla  and  N.N.  Kazanjian 
A.  Introduction 

Some of the most impressive accomplishments in modern system theory hinge on the solution of the following problem. Consider an $n \times n$ rational matrix $G(s)$ of the complex variable $s = \sigma + j\omega$ meeting the conditions

(a) $G(s) = G_*(s)$;

(b) $G(j\omega) \geq 0$, all real $\omega$;

(c) $\det G(s) \not\equiv 0$.

Produce an effective construction for the $n \times n$ rational matrix $W(s)$ which is analytic together with its inverse in $\mathrm{Re}\, s > 0$ and satisfies

$$G(s) = W_*(s)W(s). \tag{1}$$

Clearly, for any such factor $W(s)$, $\det W(s) \neq 0$, $\mathrm{Re}\, s > 0$, and $W(s)$ is said to be minimum-phase. Many frequency-domain solutions to this problem have appeared since 1958 and are extensively discussed in the literature (Refs. 1-8).

Let

$$G(s) = \frac{G_0 + G_1 s + \cdots + G_m s^m}{g_0 + g_1 s + \cdots + g_n s^n} = \frac{P(s)}{g(s)} \tag{2}$$

where the $G_r$'s are $n \times n$ constant matrices, $r = 0 \to m$, and $g(s)$ is the least common multiple of all denominators in $G(s)$. Then, $g(s) = g_*(s)$, $P(s) = P_*(s)$, $g(j\omega) > 0$ and $P(j\omega) \geq 0$, $\omega$ real. As a consequence, $m = 2\nu$ and $n = 2\mu$ are both nonnegative even integers. Using the mapping

$$z = \frac{1 - s}{1 + s}, \tag{3}$$

the closed right-half $s$-plane is imaged into the unit circle $|z| \leq 1$ and $G(s)$ is transformed into

$$R(z) = (1+z)^{n-m}\,\frac{N(z)}{d(z)},$$

$N(z)$ and $d(z)$ polynomial. Condition (a) forces

($a_1$) $R(z) = R_*(z)$

where now $A_*(z) = A^*(1/\bar{z})$, (b) requires

($b_1$) $R(e^{-j\theta}) \geq 0$, all real $\theta$

and of course

($c_1$) $\det R(z) \not\equiv 0$.

Notation. The complex conjugate, transpose, adjoint, determinant and trace of a matrix $A$ are denoted by $\bar{A}$, $A'$, $A^*$ ($\equiv \bar{A}'$), $\det A$ and $\mathrm{Tr}\, A$, respectively; $1_n$ is the $n \times n$ identity and $O_n$ the $n \times n$ zero matrix. If $A = A^*$, $A$ is hermitian and if $A^*A = 1_n$, $A$ is unitary. By $A - B \geq 0$ we mean that $A - B$ is hermitian nonnegative-definite while $A - B > 0$ signifies that it is positive-definite. For a rational matrix $A(s)$, $A_*(s) \equiv A^*(-\bar{s})$. Evidently, if $A(s)$ is real for real $s$, $A_*(s) = A'(-s)$ and always $A_*(j\omega) = A^*(j\omega)$, $\omega$ real. If $A(s) = A_*(s)$, $A(s)$ is paraconjugate hermitian and if $A_*(s)A(s) = 1_n$, it is paraconjugate unitary. (In the special case that $A(s)$ is real, $A(s)$ is said to be parahermitian or paraunitary.)

In  terms  of 


K(z)  = 


k(z)  = ^ 


we  can  write 


and ($a_1$), ($b_1$), ($c_1$) translate into

($a_2$) $K(z) = K_*(z)$; $K(e^{-j\theta}) \geq 0$

($b_2$) $k(z) = k_*(z)$; $k(e^{-j\theta}) > 0$

($c_2$) $k(z) \cdot \det K(z) \not\equiv 0$


K(z)  = B^(z)B(z) 


k(z)  = b*(z)b(z) 


where 


B(z)=Bq+Bj^z+...+B^z  , 


b(z)  = b-  + b,z+  . . . + b z^ 


(14) 
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b(z)  • detB(z)  0 , |zl  < 1 . 


R(z)  = H{z) 


in  which 

is obviously analytic together with its inverse in $|z| < 1$. Thus, transforming back,

$$W(s) = H\!\left(\frac{1-s}{1+s}\right)$$

and the construction of $W(s)$ has therefore been made to depend on the availability of a procedure for minimum-phase polynomial factorization of self-inversive matrix polynomials in the variable $z$ which are nonnegative-definite on the boundary of the unit circle.

At first glance, this reformulation of the problem in the $z$-domain appears to be of little significance since the slight advantage gained by working in the compact set $|z| \leq 1$ instead of $\mathrm{Re}\, s \geq 0$ is more than offset by the need to carry out the transformation $G(s) \to R(z)$. Nevertheless, Bauer (Ref. 9) in 1955 exploited the above correspondence in the scalar case to exhibit a novel technique well suited to computer implementation. Quite recently (Ref. 10), Kailath gave a partial generalization of Bauer's method to matrices by employing a functional-theoretic argument revolving around von Neumann's theorem on alternating projections (Ref. 11). However, his proof is valid only under the unnecessarily restrictive assumption $\det R(e^{-j\theta}) > m$, $m$ a positive constant. Our objective in this report is to supply an elementary (and highly informative) proof of this matrical Bauer extension in its most general setting and to define and derive explicit formulas for a new class of associated matrix polynomials orthogonal on the unit circle. The intimate connection with the factorization problem is established by a detailed study of their limit properties.

Before  embarking  on  section  B we  summarize  for  the  reader's  benefit  several 
useful  results  of  a purely  mathematical  nature. 

Let $A$ and $B$ be two $n \times n$ hermitian nonnegative-definite matrices. Then,

$$\det(A + B) \geq \det A + \det B. \tag{19}$$

In addition, if at least one of $A$ and $B$, $A$ say, is nonsingular, equality attains in Eq. (19) iff $B = \lambda A$, $\lambda$ a nonnegative scalar $= 0$ for $n > 1$ (Reference 12). As an interesting application, suppose $A - B \geq 0$ and $\det A = \det B \neq 0$. Clearly, from $A = (A - B) + B$ and Eq. (19) we conclude that

$$\det A \geq \det(A - B) + \det B = \det(A - B) + \det A.$$

$$\therefore \det(A - B) = 0$$

and equality attains in Eq. (19), so that

$$A - B = \lambda B \quad \text{giving} \quad \det A = (1 + \lambda)^n \cdot \det A.$$

Hence $\lambda = 0$, $A = B$ and it is seen that under certain conditions the equality of the determinants of two matrices implies equality of the matrices. More generally, Eq. (19) is included in

$$\sqrt[n]{\det(A + B)} \geq \sqrt[n]{\det A} + \sqrt[n]{\det B}, \tag{20}$$

Minkowski's inequality (Ref. 12).
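The two determinant inequalities, Eq. (19) and Minkowski's inequality, are easy to probe numerically. The following is a minimal sketch on randomly generated hermitian nonnegative-definite matrices; the matrix size, trial count, seed and the small relative slack for rounding are all assumptions made for the illustration.

```python
import numpy as np


def random_psd(n, rng):
    """Random n x n hermitian nonnegative-definite matrix of the form M M^*."""
    m = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return m @ m.conj().T


def det_real(a):
    """Determinant of a hermitian nonnegative-definite matrix as a nonnegative real."""
    return max(np.linalg.det(a).real, 0.0)


rng = np.random.default_rng(1)
n = 4
slack = 1.0 - 1e-9          # tolerate rounding error only
checks = []
for _ in range(200):
    a, b = random_psd(n, rng), random_psd(n, rng)
    d_a, d_b, d_ab = det_real(a), det_real(b), det_real(a + b)
    superadditive = d_ab >= (d_a + d_b) * slack                               # Eq. (19)
    minkowski = d_ab ** (1 / n) >= (d_a ** (1 / n) + d_b ** (1 / n)) * slack  # n-th root form
    checks.append(superadditive and minkowski)
all_hold = all(checks)
print(all_hold)
```

A random search of this kind cannot prove the inequalities; it only fails to find a counterexample.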


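The Master Inequality. Let the $n \times n$ matrix $K(\theta)$ be hermitian nonnegative-definite and $L_1$; i.e., $K(\theta) \geq 0$ a.e. with entries absolutely integrable over $-\pi \leq \theta \leq \pi$. Then,

$$\ln\det \frac{1}{2\pi}\int_{-\pi}^{\pi} K(\theta)\,d\theta \;\geq\; \frac{1}{2\pi}\int_{-\pi}^{\pi} \ln\det K(\theta)\,d\theta \;\geq\; -\infty. \tag{21}$$

Proof. Obviously, $K(\theta) \in L_1$ implies $\sqrt[n]{\det K(\theta)} \in L_1$ and since integration is a summation, Eq. (20) yields

$$\sqrt[n]{\det \frac{1}{2\pi}\int_{-\pi}^{\pi} K(\theta)\,d\theta} \;\geq\; \frac{1}{2\pi}\int_{-\pi}^{\pi} \sqrt[n]{\det K(\theta)}\,d\theta. \tag{22}$$

Taking logarithms on both sides and using the inequality of the arithmetic and geometric means we get the desired result Eq. (21), Q.E.D.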

An $n \times n$ matrix function

$$F(z) = \sum_{r=0}^{\infty} F_r z^r$$

is said to be of class $H_2$ (Ref. 13,1) if

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} \mathrm{Tr}\{F^*(re^{-j\theta})F(re^{-j\theta})\}\,d\theta$$

is bounded for all $r < 1$. This requirement is equivalent to

$$\sum_{r=0}^{\infty} \mathrm{Tr}(F_r^* F_r) < \infty. \tag{24}$$

Such a function is analytic in $|z| < 1$ and its radial interior limit

$$F(e^{-j\theta}) \equiv \lim_{r \to 1-0} F(re^{-j\theta}) \tag{25}$$

exists for almost all $\theta$ (Reference 15). The next inequality plays a central role in the sequel.

Jensen's Inequality (Ref. 15). Let $F(z) \in H_2$. Then, if $F(0)$ is nonsingular,

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} \ln|\det F(e^{-j\theta})|\,d\theta \;\geq\; \ln|\det F(0)| = \ln|\det F_0| > -\infty \tag{26}$$

with equality iff $\det F(z)$ is devoid of zeros in $|z| < 1$. That is, iff $F(z)$ is minimum-phase.
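Jensen's inequality can be illustrated in the scalar case ($n = 1$) with two first-degree polynomials that share the value $F(0) = 1$ but differ in the location of their zero. This is a sketch under assumptions: the polynomials, the grid size and the helper name `mean_log_modulus` are chosen here for the illustration, and the integral is approximated by a uniform-grid average.

```python
import numpy as np


def mean_log_modulus(coeffs, num_points=4096):
    """Approximate (1/2pi) * integral of ln|F(e^{-j theta})| for F(z) = sum_r coeffs[r] z^r."""
    theta = 2.0 * np.pi * np.arange(num_points) / num_points
    z = np.exp(-1j * theta)
    values = np.polyval(coeffs[::-1], z)   # polyval expects the highest power first
    return np.mean(np.log(np.abs(values)))


minimum_phase = [1.0, 0.5]       # F(z) = 1 + 0.5 z, zero at z = -2, outside the unit disk
non_minimum_phase = [1.0, 2.0]   # F(z) = 1 + 2 z, zero at z = -0.5, inside the unit disk

lhs_min = mean_log_modulus(minimum_phase)
lhs_non = mean_log_modulus(non_minimum_phase)
rhs = np.log(abs(1.0))           # ln|F(0)| = 0 in both cases
print(lhs_min, lhs_non, rhs)
```

The zero-free polynomial attains equality, while the polynomial with a zero inside the disk gives a strictly larger left-hand side ($\ln 2$ in this example).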

Lastly, let $\{A_i\}$ be a sequence of $n \times n$ hermitian nonnegative-definite matrices which is either monotonically nondecreasing or monotonically nonincreasing and bounded above or below, respectively, by the $n \times n$ hermitian nonnegative-definite matrix $H$. Then, element by element,

$$A = \lim_{i \to \infty} A_i$$

exists as an $n \times n$ hermitian nonnegative-definite matrix and correspondingly,

$$A_i \leq A \leq H$$

or

$$A_i \geq A \geq H,$$

$i = 1 \to \infty$. The proof is carried out by perceiving that $A_i \leq A_{i+1} \leq H$, $i = 1 \to \infty$, is equivalent to asserting that the sequence $\{a^* A_i a\}$ is monotone nondecreasing and bounded above by $a^* H a$ for every choice of $n$-vector $a$.

B.  The  Main  Results 

Let the $n \times n$ matrix $K(\theta)$ belong to $L_1$ over $I = [-\pi \leq \theta \leq \pi]$. As is well known, such a matrix possesses a Fourier series

$$K(\theta) \sim \sum_{r=-\infty}^{\infty} A_r e^{jr\theta} \tag{27}$$

whose $n \times n$ matrix coefficients

$$A_r = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-jr\theta} K(\theta)\,d\theta \to O_n \tag{28}$$

as $|r| \to \infty$ (Reference 16). Clearly, if $K(\theta)$ is hermitian, i.e., if $K(\theta) = K^*(\theta)$ a.e.,

$$A_{-r} = A_r^* \tag{29}$$

for all $r$ and the block Toeplitz matrices

$$T_m = \begin{bmatrix} A_0 & A_1 & \cdots & A_m \\ A_{-1} & A_0 & \cdots & A_{m-1} \\ \vdots & \vdots & \ddots & \vdots \\ A_{-m} & A_{-m+1} & \cdots & A_0 \end{bmatrix} \tag{30}$$

of respective sizes $(m+1)n \times (m+1)n$, $m = 0 \to \infty$, are obviously hermitian. Furthermore, if a.e. $K(\theta) = K(-\theta)$ = real matrix, all $A_r$ are real symmetric matrices and $A_{-r} = A_r$.
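The block Toeplitz matrices of Eq. (30) can be assembled directly from the Fourier coefficients. The sketch below uses a hypothetical example in which $K(\theta) = B^*B$ for a two-term real matrix polynomial, so that only $A_0$, $A_1$ and $A_{-1}$ are nonzero; the matrices `B0`, `B1` and the helper names are assumptions made for the illustration.

```python
import numpy as np


def block_toeplitz(coeff, m):
    """Assemble T_m whose (i, k) block is A_{k-i}; coeff(r) returns the n x n matrix A_r."""
    rows = [[coeff(k - i) for k in range(m + 1)] for i in range(m + 1)]
    return np.block(rows)


# Hypothetical real factor B(z) = B0 + B1 z; its spectrum has three nonzero coefficients.
B0 = np.array([[2.0, 0.0], [0.5, 1.5]])
B1 = np.array([[0.3, -0.2], [0.1, 0.4]])
A = {0: B0.T @ B0 + B1.T @ B1, 1: B1.T @ B0}
A[-1] = A[1].T                      # A_{-r} = A_r^*, Eq. (29), for real matrices


def coeff(r):
    """Fourier coefficient A_r, zero outside |r| <= 1 for this example."""
    return A.get(r, np.zeros((2, 2)))


T3 = block_toeplitz(coeff, 3)       # size (m+1)n x (m+1)n = 8 x 8
is_hermitian = np.allclose(T3, T3.conj().T)
min_eigenvalue = np.linalg.eigvalsh(T3).min()
print(T3.shape, is_hermitian, min_eigenvalue)
```

In this example the smallest eigenvalue is strictly positive, which is the behaviour Lemma 1 below predicts for a spectrum that is nonsingular on the whole unit circle.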

Lemma 1. Suppose $K(\theta) \in L_1$. Then, $K(\theta)$ is hermitian nonnegative-definite, i.e., $K(\theta) = K^*(\theta) \geq 0$ a.e.,

(1) iff every $T_m$ is nonnegative-definite, $m = 0 \to \infty$.

(2) Subject to the additional constraint

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} \ln\det K(\theta)\,d\theta > -\infty, \tag{31}$$

all $T_m$ are positive-definite, $m = 0 \to \infty$.

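Proof. Let $c_0, c_1, \ldots, c_m$ denote an arbitrary choice of $m+1$ constant $n$-dimensional column-vectors and put

$$f(\theta) = \sum_{r=0}^{m} c_r e^{-jr\theta}. \tag{32}$$

Under the assumption that $K(\theta)$ is hermitian nonnegative-definite,

$$0 \leq \frac{1}{2\pi}\int_{-\pi}^{\pi} f^*(\theta)K(\theta)f(\theta)\,d\theta \tag{33}$$

$$= \frac{1}{2\pi}\sum_{r,k=0}^{m}\int_{-\pi}^{\pi} e^{-j(r-k)\theta}\, c_k^* K(\theta) c_r\,d\theta \tag{34}$$

$$= \sum_{r,k=0}^{m} c_k^* A_{r-k}\, c_r = c^* T_m c, \tag{35}$$

where

$$c = \begin{bmatrix} c_0 \\ c_1 \\ \vdots \\ c_m \end{bmatrix}. \tag{36}$$

Hence, since $c$ is arbitrary, $T_m$ is nonnegative-definite, $m = 0 \to \infty$, Q.E.D.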

Assume conversely that all $T_m$ are nonnegative-definite and pick any constant $n$-vector $x$, any $\rho$, $0 < \rho < 1$, and any $\theta_0$. Select the $c_r$ so that

$$f(\theta) = x(1-\rho^2)^{1/2}\sum_{r=0}^{m} \rho^r e^{-jr(\theta-\theta_0)}; \tag{37}$$

i.e.,

$$c_r = x(1-\rho^2)^{1/2}\rho^r e^{jr\theta_0}, \quad r = 0 \to m. \tag{38}$$

Now $T_m$ nonnegative-definite implies that

$$0 \leq c^* T_m c = \frac{1}{2\pi}\int_{-\pi}^{\pi} f^*(\theta)K(\theta)f(\theta)\,d\theta = \frac{1}{2\pi}\int_{-\pi}^{\pi} (1-\rho^2)\left|\sum_{r=0}^{m}\rho^r e^{-jr(\theta-\theta_0)}\right|^2 x^*K(\theta)x\,d\theta. \tag{39}$$

Letting $m \to \infty$ in Eq. (39) and using dominated convergence,

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{1-\rho^2}{1 - 2\rho\cos(\theta-\theta_0) + \rho^2}\; x^*K(\theta)x\,d\theta \geq 0. \tag{40}$$

The left-hand side of Eq. (40) is Poisson's integral and as is well known, its limit as $\rho \to 1-0$ is equal to $x^*K(\theta_0)x$ for almost all $\theta_0$. Thus, a.e.,

$$x^*K(\theta)x \geq 0. \tag{40a}$$

The exceptional set can depend on the choice of $x$. However, since a countable union of sets of measure zero is a set of measure zero, Eq. (40a) certainly holds a.e. for all rational $x$. But the latter are dense in the space of all $n$-vectors $x$ and therefore $K(\theta) \geq 0$, a.e. Q.E.D.
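The role of Poisson's integral in the converse argument can be seen numerically: the kernel has unit mass and, as $\rho \to 1$, concentrates at $\theta_0$. The sketch below is an illustration under assumptions: the smooth scalar test function $\cos\theta$ stands in for $x^*K(\theta)x$, and the grid size, $\theta_0$ and the values of $\rho$ are hypothetical.

```python
import numpy as np


def poisson_integral(g, rho, theta0, num_points=4096):
    """Approximate (1/2pi) * integral of P_rho(theta - theta0) g(theta) on a uniform grid."""
    theta = -np.pi + 2.0 * np.pi * np.arange(num_points) / num_points
    kernel = (1.0 - rho**2) / (1.0 - 2.0 * rho * np.cos(theta - theta0) + rho**2)
    return np.mean(kernel * g(theta))


theta0 = 0.8
rhos = (0.5, 0.9, 0.99)
unit_mass = poisson_integral(lambda t: np.ones_like(t), 0.9, theta0)
smoothed = [poisson_integral(np.cos, rho, theta0) for rho in rhos]
target = np.cos(theta0)             # value recovered in the limit rho -> 1
print(unit_mass, smoothed, target)
```

For this test function the Poisson integral equals $\rho\cos\theta_0$ exactly, so the approach to the point value as $\rho$ increases is visible directly.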

Lastly, suppose some $T_m$ is singular. Then there exists a nontrivial constant vector $c$ of the form (36) such that $T_m c = 0$. Consequently, if

$$f(\theta) = \sum_{r=0}^{m} c_r e^{-jr\theta}$$

is constructed with this $c$, Eq. (33) to Eq. (35) show that

$$0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} f^*(\theta)K(\theta)f(\theta)\,d\theta;$$

or, since $K(\theta) \geq 0$, a.e.,

$$K(\theta)f(\theta) = 0 \quad \text{a.e.}$$

But $f(\theta) \neq 0$ a.e. whence,

$$\det K(\theta) = 0$$

a.e. and

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} \ln\det K(\theta)\,d\theta = -\infty,$$

contradicting Eq. (31). Assertion (2) now follows immediately and lemma 1 is established, Q.E.D.

The standard matrix Wiener-Hopf factorization problem can be phrased as follows. Determine the necessary and sufficient conditions for the existence of an $n \times n$ analytic matrix function

$$B(z) = \sum_{r=0}^{\infty} B_r z^r \tag{41}$$

of the class $H_2$ such that

$$\det B(z) \neq 0, \quad |z| < 1 \tag{42}$$

and

$$K(\theta) = B^*(e^{-j\theta})B(e^{-j\theta}) \quad \text{a.e.} \tag{43}$$

Of course, the answer is well known. Such a "minimum-phase" factor is unique up to multiplication on the left by an arbitrary constant $n \times n$ unitary matrix and exists iff

(a) $K(\theta)$ is hermitian nonnegative-definite, belongs to $L_1$ and satisfies the regularity condition

(b) $$\frac{1}{2\pi}\int_{-\pi}^{\pi} \ln\det K(\theta)\,d\theta > -\infty. \tag{44}$$

The necessity of (a) is obvious and that of (b) is the subject of lemma 2.

Lemma 2. If $B(z)$ is any $H_2$-factor (41) satisfying Eqs. (42) and (43),

$$-\infty < \ln|\det B_0|^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi} \ln\det K(\theta)\,d\theta. \tag{45}$$

Proof. Clearly, $\det B(z)$ is free of zeros in $|z| < 1$ and

$$\det K(\theta) = |\det B(e^{-j\theta})|^2. \tag{46}$$

To reach Eq. (45) observe first that $\det B(0) = \det B_0 \neq 0$ and then apply the equality part of Jensen's inequality, Q.E.D.

Theorem 1. The $n \times n$ matrix coefficients $B_r$, $r = 0 \to \infty$, in the power series expansion (41) of the minimum-phase analytic $n \times n$ factor $B(z)$ satisfying Eqs. (42) and (43) can be determined by means of the following two-step algorithm.

Step 1: For every $m \geq 0$ effect the (unique) Gauss factorization

$$T_m = L_m^* L_m \tag{47}$$

where

$$L_m = \begin{bmatrix} L_{00}^{(m)} & & & \\ L_{10}^{(m)} & L_{11}^{(m)} & & \\ \vdots & \vdots & \ddots & \\ L_{m0}^{(m)} & L_{m1}^{(m)} & \cdots & L_{mm}^{(m)} \end{bmatrix} \tag{48}$$


is  square  and  lower -triangular  with  positive  diagonal  scalar  entries.  (All  blocks  in 


The inequality

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} \ln\det K(\theta)\,d\theta < +\infty \tag{44a}$$

is automatic and $K(\theta) \in L_1$ plus the regularity condition (44) imply $\ln\det K(\theta) \in L_1$.
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(49) 


(50) 


Furthermore, for any fixed $\ell \geq 0$, every diagonal scalar sequence $\{(L_{m\ell})_{i,i}\}$ of the matrix

$$L_{m\ell} = \begin{bmatrix} L_{00}^{(m)} & & & \\ L_{10}^{(m)} & L_{11}^{(m)} & & \\ \vdots & \vdots & \ddots & \\ L_{\ell 0}^{(m)} & L_{\ell 1}^{(m)} & \cdots & L_{\ell\ell}^{(m)} \end{bmatrix} \tag{51}$$

is monotonically nonincreasing while for $i \neq j$ and all $A_r$ real, $r = 0 \to \infty$, $\{(L_{m\ell})_{i,j}\}$ is the difference of two monotonically nonincreasing sequences.

Proof. A 1-sided $n \times n$ trigonometric polynomial $P(\theta)$ of degree $\leq m$ has the form

$$P(\theta) = \sum_{r=0}^{m} X_r e^{-jr\theta} \tag{52}$$

where the $X_r$'s are constant $n \times n$ matrices, $r = 0 \to m$, and

$$X = \begin{bmatrix} X_0 \\ X_1 \\ \vdots \\ X_m \end{bmatrix} \tag{53}$$

is its associated $n(m+1) \times n$ coefficient matrix. Evidently, $P(\theta) = Q(e^{-j\theta})$ is the boundary value of

$$Q(z) = \sum_{r=0}^{m} X_r z^r$$

at $z = e^{-j\theta}$.


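Suppose $X_0, X_1, \ldots, X_\ell$ are prescribed in advance, $\ell \leq m$, and put

$$X_a = \begin{bmatrix} X_0 \\ \vdots \\ X_\ell \end{bmatrix}; \quad X_b = \begin{bmatrix} X_{\ell+1} \\ \vdots \\ X_m \end{bmatrix}; \quad X = \begin{bmatrix} X_a \\ X_b \end{bmatrix}.$$

For given $m$, $\ell$ and $X_a$, let $\mathcal{P}_{m,\ell}(X_a)$ denote the collection of all polynomials (52) whose coefficient matrices $X$ have the specified upper part $X_a$. Thus, $\mathcal{P}_{m,\ell}(X_a)$ is generated by assigning the lower part of $X$ arbitrarily and is therefore parametrized by $X_b$.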

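The problem of minimizing the quadratic functional

$$I(P) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \mathrm{Tr}\{P^*(\theta)K(\theta)P(\theta)\}\,d\theta \tag{56}$$

over $P$'s belonging to $\mathcal{P}_{m,\ell}(X_a)$ is a basic one and its solution is fundamental to our entire approach. Let

$$\mu_{m,\ell}(X_a) = \min_{P \in \mathcal{P}_{m,\ell}(X_a)} \frac{1}{2\pi}\int_{-\pi}^{\pi} \mathrm{Tr}\{P^*(\theta)K(\theta)P(\theta)\}\,d\theta. \tag{57}$$

Clearly, for $k \geq m$, $P \in \mathcal{P}_{m,\ell}(X_a)$ implies $P \in \mathcal{P}_{k,\ell}(X_a)$ and

$$\mu_{k,\ell}(X_a) \leq \mu_{m,\ell}(X_a).$$

Consequently, the sequence $\mu_{m,\ell}(X_a)$ is monotone nonincreasing with respect to $m$ and the existence of

$$\mu_\ell(X_a) = \lim_{m \to \infty} \mu_{m,\ell}(X_a) \geq 0 \tag{59}$$

is assured for every fixed $\ell \geq 0$ and $X_a$.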

To  derive  an  explicit  expression  for  ^ (X^)  we  first  partition  L^  in  the  con- 
formable manner 
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»■ 


Let $X_{a1}$ and $X_{a2}$ be any two fixed choices of $X_a$. From the convergence of the three sequences

$$\{\mu_{m,\ell}(X_{a1})\}, \quad \{\mu_{m,\ell}(X_{a2})\}, \quad \{\mu_{m,\ell}(X_{a1} + X_{a2})\}$$

we infer that of $\{\mathrm{Re}\,\mathrm{Tr}(X_{a1}^* F_{m,\ell} X_{a2})\}$ as the difference of two monotone nonincreasing sequences. Similarly, replacing $X_{a2}$ by $jX_{a2}$ we infer that of $\{\mathrm{Im}\,\mathrm{Tr}(X_{a1}^* F_{m,\ell} X_{a2})\}$ and finally that of $\{\mathrm{Tr}(X_{a1}^* F_{m,\ell} X_{a2})\}$ as the difference of two (possibly complex) sequences whose respective real and imaginary parts are monotonically nonincreasing. In the special but important case that all $A_r$ are real, $r = 0 \to \infty$, all matrices $F_{m,\ell}$ are real and $\{\mathrm{Tr}(X_{a1}' F_{m,\ell} X_{a2})\}$ is the difference of two monotone nonincreasing sequences for every real pair $(X_{a1}, X_{a2})$.

By a suitable choice of $X_{a1}$ and $X_{a2}$, $\mathrm{Tr}(X_{a1}^* F_{m,\ell} X_{a2})$ can be made equal to any scalar entry in $F_{m,\ell}$ and therefore

$$F_\ell = \lim_{m \to \infty} F_{m,\ell} \tag{69}$$

exists for every fixed $\ell \geq 0$. More strongly, every diagonal scalar element in $F_{m,\ell}$ approaches its limit in a monotone nonincreasing manner and if all $A_r$ are real, every off-diagonal element converges as the limit of the difference of two monotone nonincreasing sequences.


Although it is evident that the limit matrix $F_\ell$ is hermitian nonnegative-definite, it is not at all obvious that it is actually positive-definite. From the formula

$$\sqrt{\det F_{m,\ell}} = \prod_{r=0}^{\ell} \det L_{rr}^{(m)} \tag{70}$$

it is seen that $\det F_\ell = 0$ implies

$$\inf_m \det L_{rr}^{(m)} = 0 \tag{71}$$

for at least one $r$, $0 \leq r \leq \ell$. However, by examining the lower right-hand corner $n(m+1) \times n(m+1)$ block of $T_{m+1}$ (use Eqs. (47) and (48)), it is easily seen that

$$L_{00}^{(m)} = L_{11}^{(m+1)} \quad \text{and} \quad L_{10}^{(m)} = L_{21}^{(m+1)}. \tag{72}$$

In general, for given $q \geq 0$ and $r \geq k$,

$$L_{rk}^{(m)} = L_{r+q,k+q}^{(m+q)} \tag{73}$$

for $m$ sufficiently large. Thus, if Eq. (71) is satisfied for one value of $r$ it is satisfied for every $r = 0 \to \ell$ and $\det F_\ell = 0$ implies

$$\inf_m \det L_{00}^{(m)} = 0 \tag{74}$$

which, as we will show, is impossible.

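Consider the problem of minimizing $I(P)$ over $\mathcal{P}_{m,0}(1_n)$ and let $P_m(\theta)$ denote the corresponding extremalizing polynomial. Note that $X_a = X_0 = 1_n$ and $P_m(\theta) = Q_m(e^{-j\theta})$ where

$$Q_m(z) = 1_n + \sum_{r=1}^{m} X_r z^r. \tag{75}$$

From Eqs. (64) and (67),

$$L_{00}^{(m)*} L_{00}^{(m)} = F_{m,0} = \frac{1}{2\pi}\int_{-\pi}^{\pi} P_m^*(\theta)K(\theta)P_m(\theta)\,d\theta \tag{76}$$

and applying the master inequality and Jensen's inequality (section A) in succession,

$$\ln\det F_{m,0} \geq \frac{1}{2\pi}\int_{-\pi}^{\pi} \ln\det K(\theta)\,d\theta + \frac{1}{2\pi}\int_{-\pi}^{\pi} \ln|\det P_m(\theta)|^2\,d\theta \geq \frac{1}{2\pi}\int_{-\pi}^{\pi} \ln\det K(\theta)\,d\theta \tag{77}$$

since $\ln\det Q_m(0) = \ln\det 1_n = 0$. Thus, imposing the regularity condition (44),

$$\inf_m \det F_{m,0} \geq \exp\left\{\frac{1}{2\pi}\int_{-\pi}^{\pi} \ln\det K(\theta)\,d\theta\right\} > 0. \tag{78}$$

$$\therefore \inf_m \det L_{00}^{(m)} = \inf_m \sqrt{\det F_{m,0}} > 0 \tag{79}$$

and assumption (74) is untenable, Q.E.D.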

The hermitian positive-definite character of the limit matrix $F_\ell$ now permits an easy proof of the existence of

$$L_\ell = \lim_{m \to \infty} L_{m\ell}. \tag{80}$$

Let

$$F_\ell = \Lambda_\ell^* \Lambda_\ell \tag{81}$$

where $\Lambda_\ell$ is the unique square, lower-triangular factor with positive diagonal elements and define

$$U_{m\ell} = L_{m\ell}\Lambda_\ell^{-1}. \tag{82}$$

Observe that $U_{m\ell}$ is also square and lower-triangular with positive diagonal elements. From Eq. (67),

$$\lim_{m \to \infty} (U_{m\ell}^* U_{m\ell}) = 1 \tag{83}$$

and a straightforward argument taking the properties of $U_{m\ell}$ into account yields

$$\lim_{m \to \infty} U_{m\ell} = 1. \tag{84}$$

Or, from Eq. (82),

$$\lim_{m \to \infty} L_{m\ell} = \Lambda_\ell. \tag{85}$$

By selecting $\ell$ large enough any block $L_{rk}^{(m)}$, $r \geq k$, can be encompassed within $L_{m\ell}$ and therefore

$$\lim_{m \to \infty} L_{rk}^{(m)} = B_{rk} \tag{86}$$

exists, Q.E.D.

In view of Eq. (73),

$$\lim_{m \to \infty} L_{r+q,k+q}^{(m+q)} = B_{r+q,k+q} = \lim_{m \to \infty} L_{rk}^{(m)} = B_{rk}$$

whence,

$$B_{rk} = \lim_{m \to \infty} L_{rk}^{(m)} = B_{r-k}, \quad r \geq k. \tag{87}$$

In particular,

$$\lim_{m \to \infty} L_{r0}^{(m)} = B_r, \quad r = 0 \to \infty. \tag{88}$$

Thus, for any fixed $\ell \geq 0$,

$$\lim_{m \to \infty} L_{m\ell} = \begin{bmatrix} B_0 & & & \\ B_1 & B_0 & & \\ \vdots & \vdots & \ddots & \\ B_\ell & B_{\ell-1} & \cdots & B_0 \end{bmatrix}. \tag{89}$$

By appealing to the already established convergence properties of $\{F_{m,\ell}\}$ and a slight generalization of the minimum problem (instead of specifying the $n \times n$ blocks $X_0, X_1, \ldots, X_\ell$ in advance we fix the first $\ell + 1$ scalar rows, from the top, of the coefficient matrix $X$) we can readily demonstrate that every diagonal scalar sequence

$$\{(L_{m\ell})_{i,i}\}$$

is monotonically nonincreasing while for all $i \neq j$ and all $A_r$ real, $r = 0 \to \infty$,

$$\{(L_{m\ell})_{i,j}\}$$

is the difference of two monotonically nonincreasing sequences. By going to the limit in Eq. (77) we derive the important inequality (actually an equality),

$$\ln|\det B_0|^2 \geq \frac{1}{2\pi}\int_{-\pi}^{\pi} \ln\det K(\theta)\,d\theta. \tag{90}$$

From Eqs. (47) and (48),

$$A_0 = \sum_{r=0}^{m} L_{r0}^{(m)*} L_{r0}^{(m)}$$

which implies that

$$A_0 - \sum_{r=0}^{\ell} L_{r0}^{(m)*} L_{r0}^{(m)} \geq 0$$

for every fixed $\ell \leq m$. Letting $m \to \infty$,

$$A_0 - \sum_{r=0}^{\ell} B_r^* B_r \geq 0$$

which assures both the existence of the infinite sum

$$\sum_{r=0}^{\infty} B_r^* B_r$$

and the inequality

$$A_0 - \sum_{r=0}^{\infty} B_r^* B_r \geq 0.$$

The matrix function

$$B(z) = \sum_{r=0}^{\infty} B_r z^r \tag{94}$$

is therefore analytic and of class $H_2$ in $|z| < 1$.

Generalizing, from $T_m = L_m^* L_m$ and Eq. (60) we obtain

$$T_\ell = L_{m\ell}^* L_{m\ell} + L_{ma}^* L_{ma} \tag{95}$$

which, when subjected to the same limiting process as above gives

$$T_\ell - (B^{(\ell)})^* B^{(\ell)} \geq 0. \tag{96}$$

Here

$$B^{(\ell)} = \begin{bmatrix} B_0 & & & \\ B_1 & B_0 & & \\ \vdots & \vdots & \ddots & \\ B_\ell & B_{\ell-1} & \cdots & B_0 \\ B_{\ell+1} & B_\ell & \cdots & B_1 \\ \vdots & \vdots & & \vdots \end{bmatrix} \tag{97}$$

has $\ell + 1$ block columns and an infinite number of block rows. But $(B^{(\ell)})^* B^{(\ell)}$ is nothing more than the corresponding Toeplitz matrix (30) generated by the Fourier coefficients of $B^*(e^{-j\theta})B(e^{-j\theta})$ and invoking lemma 1, Eq. (96) yields

$$K(\theta) - B^*(e^{-j\theta})B(e^{-j\theta}) \geq 0 \quad \text{a.e.} \tag{98}$$

Thus, from Eq. (19),

$$\ln\det K(\theta) \geq \ln|\det B(e^{-j\theta})|^2 \tag{99}$$

a.e. and Jensen's inequality plus Eq. (90) give

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} \ln\det K(\theta)\,d\theta \geq \frac{1}{2\pi}\int_{-\pi}^{\pi} \ln|\det B(e^{-j\theta})|^2\,d\theta \geq \ln|\det B_0|^2 \geq \frac{1}{2\pi}\int_{-\pi}^{\pi} \ln\det K(\theta)\,d\theta. \tag{100}$$

Consequently, all four terms in Eq. (100) are equal. From the equality of the inner two we deduce (Jensen) that $\det B(z)$ is free of zeros in $|z| < 1$ and from the equality of the two on the left and Eq. (99),

$$\det K(\theta) = |\det B(e^{-j\theta})|^2, \quad \text{a.e.} \tag{101}$$

However, because of Eq. (98), the a.e. nonsingularity of $K(\theta)$ and the hermitian nonnegative-definite character of both $K(\theta)$ and $B^*(e^{-j\theta})B(e^{-j\theta})$, Eq. (101) is possible iff

$$K(\theta) = B^*(e^{-j\theta})B(e^{-j\theta}) \quad \text{a.e.} \tag{102}$$

The proof of theorem 1 is complete, Q.E.D.
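The limiting behaviour asserted in theorem 1 can be observed in the scalar, real case ($n = 1$), which is the setting of Bauer's original method. The sketch below is an illustration under assumptions, not the report's own procedure: it uses NumPy's standard Cholesky factor and reads off its reversed last row, which for a real symmetric Toeplitz matrix coincides with the first column of the lower-triangular factor $L$ in $T = L^T L$. The test factor `b_true` and the function name are hypothetical.

```python
import numpy as np


def bauer_spectral_factor(autocorr, m):
    """Approximate minimum-phase coefficients from the (m+1) x (m+1) Toeplitz section.

    autocorr[r] holds the real coefficient A_r for r >= 0 (A_{-r} = A_r is assumed).
    """
    a = np.zeros(m + 1)
    a[: len(autocorr)] = autocorr
    idx = np.arange(m + 1)
    toeplitz = a[np.abs(idx[:, None] - idx[None, :])]   # entry (i, k) is A_{|i-k|}
    chol = np.linalg.cholesky(toeplitz)                  # toeplitz = chol @ chol.T
    # Reversed last row of `chol` = first column of L in toeplitz = L.T @ L
    # (valid for a real symmetric Toeplitz matrix).
    return chol[-1, ::-1]


b_true = np.array([1.0, 0.5])                            # B(z) = 1 + 0.5 z, zero at z = -2
autocorr = np.array([b_true @ b_true, b_true[0] * b_true[1]])   # A_0 = 1.25, A_1 = 0.5
estimates = {m: bauer_spectral_factor(autocorr, m)[:2] for m in (1, 2, 5, 30)}
for m, b in estimates.items():
    print(m, b)
```

The leading coefficient decreases toward $B_0 = 1$ as $m$ grows, in line with the monotone nonincreasing behaviour of the diagonal entries stated in the theorem.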

C.  Matrix  Polynomials  Orthogonal  on  the  Unit  Circle 

Before introducing these polynomials we shall establish an important property of the $Q_m(z)$. Recall that $P_m(\theta)$, the unique $n \times n$ trigonometric polynomial minimizing $I(P)$ over $\mathcal{P}_{m,0}(1_n)$, equals $Q_m(e^{-j\theta})$ where

$$Q_m(z) = 1_n + \sum_{r=1}^{m} X_r z^r,$$

$m = 0 \to \infty$ and the $X_r$ are chosen in accordance with Eq. (68). It is convenient to refer to $Q_m(z)$ as the $m$th generating polynomial over $\mathcal{P}_{m,0}(1_n)$.

Lemma 3. The matrix polynomial $Q_m(z)$ is minimum-phase; i.e.,

$$\det Q_m(z) \neq 0, \quad |z| < 1, \tag{103}$$

$m = 0 \to \infty$.

Proof. Suppose that $\det Q_m(z_0) = 0$, $|z_0| < 1$. Then, since $Q_m(0) = 1_n$, $z_0 \neq 0$ and for some $n$-vector $x$ satisfying $x^*x = 1$,

$$Q_m(z_0)x = 0. \tag{104}$$
An  easy  check  shows  that  the  matrix 


A(z)  = 1 


1 


n 1 + z 


1 - zz* 
o 


is  regular  paraconjugate  unitary;  i.e.,  A(z)  is  analytic  in  jz]  < 1 and 


(For  z = e , .A(e  ) is  unitary.  ) Noting  that 
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1 + z* 

det  MO)  = z^  • yTT  ^ ° ’ 


(107) 


it  is  obvious  that 


.-1, 


Aj^(z)=  A'‘(0)A(z) 

is  analytic  in  jzj  <1  and  normalized  to  1^  at  z = 0. 
Let 

Q(z)  = Q^(z) 


(10b) 


1 - z. 


-Qj^(z)dn-  i + z>;= 


1 + z 
z - z 


XX*)  A(0) 


(109) 


Clearly,  0(0)  = 1 and  the  residue  matrix  at  the  only  possible  finite  pole  z=  z^  is  pro- 
portional to  Q^(z")xx*  = 0.  Thus  Q(z)  is  an  nxn  polynomial  matrix  in  z of  degree  < m 

and  P(0)  = Q(e'^®)e  ^rn,0^^n^-  correlated  cost 


i(P)  = 1 J Tr  [p*(e)K(e)P(0)]de 


= ^ / Tr  {A‘'(0)  A(e 


■j®)p'' (0)K(0)  P (0)  /.""(e'^®)  A(O);id0 


<4_  f’^Tr  {A(e"''®)P*(0)K(0)P  (6)  A'(e*j®)]d0 


= 4-  f'^Tr  (P*  (0)K(0)P^(0)]d0 
Ztt  ID  rn 

-TT 

(In going from line 2 to the inequality in line 3 we have made use of the fact that $A(0)A^*(0)$ possesses only the two distinct eigenvalues 1 and $|z_0|^2 < 1$.) Hence, $I(P) \le I(P_m)$ and by uniqueness, $P(\theta) = P_m(\theta)$ which, in view of Eq. (109), is absurd unless x is the zero vector, a contradiction, Q.E.D.

Evidently, because of the extremalizing property of $P_m(\theta)$, for every $m \ge 0$

$$P(\theta) = e^{jm\theta}P_m(\theta)$$ (110)

is the unique trigonometric polynomial of the form

$$e^{jm\theta}\,1_n + \sum_{r=0}^{m-1} X_r e^{jr\theta}$$ (111)

minimizing $I(P)$ and is realized as the boundary value at $z = e^{j\theta}$ of the monic polynomial

$$Q^{(m)}(z) \equiv z^m Q_m(1/z) = z^m\,1_n + \sum_{r=0}^{m-1} Q_r^{(m)} z^r ,$$ (112)

$$Q_r^{(m)} = X_{m-r} , \quad r = 0 \to m-1 .$$ (113)

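More important is the observation that the sequence $\{Q^{(m)}(z)\}$ is matrix left-orthogonal for $K(\theta)$; i.e.,

$$\langle Q^{(r)}(z), Q^{(k)}(z)\rangle \equiv \frac{1}{2\pi}\int_{-\pi}^{\pi} Q^{(r)*}(e^{j\theta})K(\theta)Q^{(k)}(e^{j\theta})\,d\theta = O_n ,$$ (114)

$r \neq k = 0 \to \infty$.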


Proof.  From  the  identification  (112)  and  Eqs.  (63)  + (68)  we  obtain  the 
formula* 


L Q - L^™^  + + + L^°^ 

m“m  " -^00  + So  + • • • + 


(115) 


where 


T‘n 


a = 

m 


Q(m)  j 

m - 1 n 


Q(m)  Q(m-l) 

m-2  '^m-2 


„(m)  Q(m-l) 
0 0 


. 1 


(115) 


m = 0 w.  Thus,  all  the  off-diagonal  nx  n blocks  of 


* $K(\theta)$ is assumed to possess all the properties enumerated in theorem 1.

** A + B is the "direct" sum of matrices A and B.
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sje  ^ 

Q (L  L )u  = n T n 

m m m m m m m 


(117) 


equal  O^  which  is  precisely  what  is  required  for  the  validity  of  Eq.  (114),  Q . E.  D. 
According  to  Eq.  (115), 


n 

(m) 

m - 1 

! - <^m>;l  • Ho 


(118) 


^(m) 


$(L_m^{-1})_1$ denoting the first block column of the inverse of $L_m$. Hence, Eq. (118) can be used to generate the coefficients of all the $Q^{(m)}(z)$, $m = 0 \to \infty$.
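In the same way, the n×n matrix polynomials

$$R^{(m)}(z) = z^m\,1_n + \sum_{r=0}^{m-1} R_r^{(m)} z^r ,$$ (119)

$m = 0 \to \infty$, are said to be right-orthogonal for $K(\theta)$ if

$$\langle R^{(r)}(z), R^{(k)}(z)\rangle \equiv \frac{1}{2\pi}\int_{-\pi}^{\pi} R^{(r)}(e^{j\theta})K(\theta)R^{(k)*}(e^{j\theta})\,d\theta = O_n ,$$ (120)

$r \neq k = 0 \to \infty$.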

It follows immediately that the sequence $\{R^{(m)\prime}(z)\}$ is left-orthogonal for $K'(\theta)$ and in particular, if $K(\theta)$ is symmetric,


$m = 0 \to \infty$.    (121)


In  general  if 


T ^ 

m 


-1 


-m+  1 


A A , 
m m - 1 


= M M 
m m 


[122) 


sd 


I'.* 
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M 


M 


(m) 


00 


M 


(m)  j^(m 


10 


1 1 


M 


(m)  j^(m) 


mO 


ml 


M 


(m) 


(123) 


chosen  as  the  unique  lower -triangular  factor  with  positive  diagonal  scalar  elements, 
'l 


m - 1 


R 


(m)* 


m - 2 


' R 


(m)* 


m ) 1 


M 


(m) 


00 


m = 0 -»  00 


(124) 


When all $A_k$'s are real and symmetric (which is true if $K(\theta)$ is real and even), the concepts of left and right-orthogonal on the unit circle simplify and


R^*^^z)  = (z)  , m = 0 00 


(125) 


Equating the corresponding $(\ell+1)n \times (\ell+1)n$ upper left-hand corner blocks on both sides of Eq. (115) yields


“'mf 


o - I (ni)  i T ("’-I)  + 

mf  ■ -^00  + So  + 


-1  L 


(m-f 


00 


(126) 


where 


n 

,(m) 


m - 1 


n 


mf 


m -2 


n 

,(m-l) 


m “2 


I , 


,(m) 


.(m-1) 


m-f 


(127) 


J 


Let $m \to \infty$ in Eq. (126) with $\ell$ held fixed and define

$$\Omega_\ell = \lim_{m \to \infty} \Omega_{m\ell} .$$ (128)


Then,  using  Eq . (89), 


A.  • SL 

e I 


Bq  + Bq  + . . . + Bq 


(129) 


and  we  have  derived  the  remarkable  result 


’’o'"! 


-I-l 


S!.  = I«o‘b2 


(130)

Equivalently, for $\ell = 0 \to \infty$ all limits

$$C_\ell = \lim_{m \to \infty} Q_{m-\ell}^{(m)}$$ (131)


exist and are given explicitly by the corresponding entries in the first block column of the inverse matrix appearing in Equation (130). Furthermore, as should be clear from Eq. (129), the power series

$$C(z) = 1_n + \sum_{r=1}^{\infty} C_r z^r$$ (132)

provides the inverse of the normalized canonic factor $B_0^{-1}B(z)$:

$$(1_n + B_0^{-1}B_1 z + B_0^{-1}B_2 z^2 + \cdots)(1_n + C_1 z + C_2 z^2 + \cdots) = 1_n .$$ (133)

Thus, the limiting values (131) of the coefficients of the left-orthogonal polynomials $Q^{(m)}(z)$ can be employed to construct the normalized canonic factor $B_0^{-1}B(z)$ or vice versa! For example,


(134) 




(134a) 


etc., involving the inversion of only one n×n matrix

$$B_0 = \lim_{m \to \infty} L_{00}^{(m)} .$$
D.  Recursion  Formulas  for  the  Orthogonal  Polynomials 

As is well known, in the scalar case (n = 1) the $Q^{(m)}(z)$ are tied together by a recursion of the form

$$Q^{(m+1)}(z) = zQ^{(m)}(z) - \alpha_m z^m Q_*^{(m)}(z) ,$$ (135)

$m = 0 \to \infty$, where $Q_*(z) \equiv Q^*(1/z^*)$. However, for n > 1 the situation is complicated by the existence of distinct left and right-orthogonal sets.

Theorem 2. Let $K(\theta)$ satisfy the conditions of theorem 1. Then,

(1) the left and right-orthogonal n×n matrix polynomials associated with $K(\theta)$ are generated by the recurrences

$$Q^{(m+1)}(z) = zQ^{(m)}(z) - z^m R_*^{(m)}(z)\,\alpha_m ,$$ (136)

$$R^{(m+1)}(z) = zR^{(m)}(z) - z^m \rho_m\,Q_*^{(m)}(z) ,$$ (137)

where 

» = 

R<°H)  ^ 

= 1 and 
n 

a = 
m 

1^00 

V‘  ■ 

(R(m 

^K).i 

= ■ 
1 00  00  ' 

(KQ^*^)).!  . 

(138) 

= 

m 

(R*"'H)_j  • 

<4” 

)*  r (m 
^00 

- (KQ  )_j  (Lqq 

. (m)  -1 
^00  ' 

(139) 

m = 0 oc 

In Eqs. (138) and (139), $(R^{(m)}K)_k$ and $(KQ^{(m)})_k$ denote the constant n×n matrix coefficients of $e^{jk\theta}$ in the Fourier expansions of $R^{(m)}(e^{j\theta})K(\theta)$ and $K(\theta)Q^{(m)}(e^{j\theta})$, respectively. Hence, if m = 0


(R^°>K)_j  = (K)  = A*  = (K)_j  = (KQ^°h_j 


^00  ^00  "^0  “ Ho  ^00  ' 


(140) 

(141) 

(142) 


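(2) For each $m = 0 \to \infty$, $e^{-jm\theta}Q^{(m)}(e^{j\theta})$ and $e^{-jm\theta}R^{(m)}(e^{j\theta})$ are the unique n×n trigonometric polynomials of the form

$$P(\theta) = 1_n + \sum_{r=1}^{m} X_r e^{-jr\theta}$$ (143)

minimizing the respective quadratic functionals

$$I_Q(P) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \mathrm{Tr}\,[P^*(\theta)K(\theta)P(\theta)]\,d\theta$$ (144)

and

$$I_R(P) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \mathrm{Tr}\,[P(\theta)K(\theta)P^*(\theta)]\,d\theta .$$ (145)

(3) For $m = 0 \to \infty$,

$$\det Q^{(m)}(z) \cdot \det R^{(m)}(z) \neq 0 , \quad |z| > 1 .$$ (146)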


Proof. Only part (1) requires verification. Since

$$z^m\,1_n - Q^{(m)}(z) \quad \text{and} \quad z^m\,1_n - R^{(m)}(z)$$

are both of degree ≤ m-1, it is clear that

$$\frac{z^m\,1_n - Q^{(m)}(z)}{z^{m-1}} = \sum_{r=0}^{m-1} R_*^{(r)}(z)\,\alpha_r^{(m)}$$ (147)

and

$$\frac{z^m\,1_n - R^{(m)}(z)}{z^{m-1}} = \sum_{r=0}^{m-1} \rho_r^{(m)}\,Q_*^{(r)}(z)$$ (148)

for some choice of n×n constant matrices $\alpha_r^{(m)}$, $\rho_r^{(m)}$, $r = 0 \to m-1$, $m = 1 \to \infty$. Now from the very definition of the orthogonal polynomials it follows that

k = 0 ->  m - 1 , 


(R^""^K)^  = O^ 


_ r (m)*  j (m)  , (m)  . _ ^(1")*  w(m) 

(KQ  )^  - Lqq  Lqq  . (R  K)^  - Mqq  Mq^ 


(149) 

(150) 


Thus 


(R^'^'kQ^'^^)  , = =0  ,r=0-^m-l  . 

' 'm  - 1 ' m-1  n 


(151) 


Setting  z = e'^®,  multiplying  both  sides  of  Eq.  (147)  on  the  left  by  R^^^e'^®)  K(0)  and  then 
using  the  orthogonality  relations  (120)  plus  (150)  and  (151)  gives 
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Similarly,  from  Eq.  (148), 


_ 
r 


'-1 


' 00  00' 


(152) 


(153) 


and both $\alpha_r^{(m)}$ and $\rho_r^{(m)}$ are independent of m, $r = 0 \to \infty$; i.e., $\alpha_r^{(m)} = \alpha_r$ and $\rho_r^{(m)} = \rho_r$.

Equations (147) and (148) now read

$$Q^{(m)}(z) = z^m\,1_n - z^{m-1}\sum_{r=0}^{m-1} R_*^{(r)}(z)\,\alpha_r ,$$ (154)

$$R^{(m)}(z) = z^m\,1_n - z^{m-1}\sum_{r=0}^{m-1} \rho_r\,Q_*^{(r)}(z) ,$$ (155)

and Eqs. (136), (137) drop out immediately.

Lastly, multiplying both sides of Eq. (137) on the right by $K(\theta)$ and integrating over $I = [-\pi \le \theta \le \pi]$ we get


, (m)-  (m)  -1 

^^00  ^00  ' 


establishing Eqs. (138) and (139), Q.E.D.

D.C. Youla and N.N. Kazanjian


REFERENCES 


1. D.C. Youla, "On the Factorization of Rational Matrices," IRE Trans. Information Theory, IT-7, No. 3, 172-189 (July 1961).

2. E. Wong and J.B. Thomas, "On the Multidimensional Prediction and Filtering Problem and the Factorization of Spectral Matrices," J. Franklin Inst., 272, No. 2, 87-99 (August 1961).

3. A.C. Riddle and B.D. Anderson, "Spectral Factorization - Computational Aspects," IEEE Trans. Automatic Control, AC-11, 764-765 (October 1966).

4. F. Csaki and P. Fischer, "On the Spectrum Factorization," Acta Tech. (Budapest), 145-168 (1967).

5. M.G. Strintzis, "A Solution to the Matrix Factorization Problem," IEEE Trans. Information Theory, IT-18, 225-232 (March 1972).

6. B.D.O. Anderson, "An Algebraic Solution to the Spectral Factorization Problem," IEEE Trans. Automatic Control, AC-12, 410-414 (August 1967).

7. W.G. Tuel, "Computer Algorithm for Spectral Factorization of Rational Matrices," IBM J. Res. Develop., 163-170 (March 1968).

8. G. Tunnicliffe Wilson, "The Factorization of Matricial Spectral Densities," SIAM J. Appl. Math., No. 4, 420-426 (December 1972).



9. F.L. Bauer, "Ein direktes Iterationsverfahren zur Hurwitz-Zerlegung eines Polynoms," Archiv der Elektrischen Übertragung, 9, 285-290 (1955).

10. J. Rissanen and T. Kailath, "Partial Realization of Random Systems," in 5th IFAC World Congress, (paper 35.6), Paris (1972).

11. J. von Neumann, "The Geometry of Orthogonal Spaces," Vol. II (Princeton University Press, 1950).

12. L. Mirsky, "Linear Algebra," (Clarendon Press, 1963).

13. N. Wiener and P. Masani, "The Prediction Theory of Multivariate Processes," Part I: "The Regularity Condition," Acta Math., 98 (1957); Part II: "The Linear Predictor," Acta Math., 99 (1958).

14. G.H. Hardy, J.E. Littlewood and G. Polya, "Inequalities," (Cambridge University Press, 1952).

15. R.P. Boas, "Entire Functions," (New York: Academic Press, 1954).

16. A. Zygmund, "Trigonometric Series," 2nd Ed., (Cambridge University Press, 1959).

17. U. Grenander and G. Szego, "Toeplitz Forms and Their Applications," (University of California Press, 1958).




SYSTEMS,  CONTROL  AND  NETWORKS 


ON THE DESIGN OF SINGLE-LOOP SINGLE-INPUT-OUTPUT FEEDBACK CONTROL SYSTEMS IN THE COMPLEX-FREQUENCY DOMAIN

J.J. Bongiorno, Jr. and D.C. Youla


This summary is concerned with the single-loop, single-input-output, feedback control system shown in Figure 1. The subsystems are modeled by rational transfer functions and the transfer functions $F(s)$, $F_0(s)$, $P(s)$ and $P_0(s)$ are given. Attention is restricted to the large class of practical cases for which $P(s)$ is proper* and $F(s)P(s)C(s) \not\equiv 0$.** The objective is the determination of the controller transfer function $C(s)$ for which the system is asymptotically stable and design objectives are met. This has always been the objective in feedback system design. What is novel about the approach reported on here is that the family of all stabilizing controller transfer functions for the given $F(s)$ and $P(s)$ is identified and utilized. Specifically, every controller which stabilizes the loop is of the form (Ref. 2)

$$C(s) = \frac{Y(s) + A(s)K(s)}{X(s) - B(s)K(s)}$$ (1)

where the polynomials $A(s)$, $B(s)$, $X(s)$ and $Y(s)$ are determined from the product $F(s)P(s)$ -- the given data -- and $K(s)$ is any rational function*** free of finite poles (analytic) in Re s ≥ 0. Moreover, all the freedom that exists in selecting the closed-loop characteristic polynomial resides in the choice of denominator for $K(s)$. The selection of this denominator is the first step in the design method described here and is done so that asymptotic stability as well as acceptable modal responses are assured. Hence, the struggle over stability ever-present in all classical design schemes is circumvented! The remaining design freedom lies in the selection of the numerator for $K(s)$.

All the available design freedom (i.e., the selection of a rational $K(s)$ analytic in Re s ≥ 0) is utilized in Refs. 1 and 2 to obtain controllers which are optimal with respect to a performance index selected on the basis of sound engineering considerations. The optimal approach, however, requires that the system input signals be characterized by appropriate power spectral densities. These data are often not available or too costly to obtain. This justifies the attention which has been directed toward the selection of a suboptimal controller.


* A transfer function $T(s)$ is proper if $T(\infty)$ is finite and strictly proper if $T(\infty) = 0$. Otherwise, it is improper.

** If $F(s)P(s)C(s) \equiv 0$ there is no benefit to be gained from using feedback and the entire discussion makes no sense.

*** Only the case $X - BK \equiv 0$ is excluded.


The suboptimal design procedure which has been developed produces a controller transfer function which insures that the loop sensitivity function*

$$S(s) = (1 + FPC)^{-1}$$ (2)

satisfies**

$$1 - S(s) = O(s^{-\mu}) , \quad |s| \to \infty$$ (3)

where $\mu$ is a selected positive integer.

The value of $\mu$ chosen depends on engineering considerations and design objectives. Suppose, for example, that the restriction $C(s)$ proper is imposed. One easily obtains from Eq. (2) that

$$C(s) = \frac{1 - S}{FPS} .$$ (4)
When 


Function  arguments  are  omitted  wherever  convenient, 

q q 

T(s)  = O(s^)  means  that  T(s)  behaves  like  s for  the  indicated  range  of  s. 
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FP=0(s‘‘')  , 


it  immediately  follows  that  C(s)  is  proper  if,  only  if,  p > v.  Alternatively,  in  order  to 
guard  against  saturation  owing  to  the  large  bandwidth  typically  possessed  by  the  noise 
m^,  one  might  impose  the  restriction  that 

$$R(s) = C(s)S(s) = \frac{1 - S}{FP}$$ (6)

be strictly proper; the transfer function from this noise to r is $-R(s)$. In this case, $\mu \ge \nu + 1$ is necessary and sufficient.

There are cogent reasons for insisting that $\mu \ge 1$ in all cases. Then,

$$S(s) \to 1 , \quad |s| \to \infty$$ (7)

or, equivalently,

$$FPC \to 0 , \quad |s| \to \infty .$$ (8)

One important attribute associated with this behavior is that the sensitivity of loop stability to high-frequency modeling errors in $P(s)$ and $F(s)$ is eliminated. That this is the case is readily appreciated with the aid of the Nyquist criterion.

Another factor which enters into the selection of $\mu$ is the degree of $C(s)$. It has been established that the larger $\mu$, the higher the degree of $C(s)$. The computations required are least when the smallest $\mu$ is selected. In the past, one could also appeal to practical considerations to motivate the choice of $C(s)$ of least possible degree. Today, however, one must recognize "... the revolutionary impact of the integrated circuit operational amplifier on the practical realizability of compensating transfer functions. Poles and zeros may now be sprinkled like salt and pepper over the s-plane ...."

The design procedure described here permits the accommodation of steady-state error specifications. It also allows for including additional design parameters in the numerator polynomial of $K(s)$ which can be selected to improve system performance. For example, design parameters can be introduced to help shape the real-frequency amplitude characteristic of the sensitivity function or to minimize the integral square error. As already pointed out, the characteristic polynomial of the system is selected in advance by the designer and, therefore, the system modes are always under his control. For these reasons, it is believed that the design procedure presented here affords some of the needed flexibility to manage the filter and feedback problems and partially answers the criticisms raised in Ref. 3 concerning the design of single-degree-of-freedom feedback systems.

A. The Design Procedure

The polynomials $A(s)$, $B(s)$, $X(s)$ and $Y(s)$ are obtained from $F(s)$ and $P(s)$. Specifically,

$$F(s)P(s) = \frac{B(s)}{A(s)} ,$$ (9)

where $A(s)$ and $B(s)$ are relatively prime. It immediately follows that there exist polynomials $X(s)$, $Y(s)$ so that

$$AX + BY \equiv 1 .$$ (10)

Let

$$K(s) = \frac{M(s)}{L(s)} .$$ (11)

It can be shown that the characteristic polynomial for the closed loop is given by

$$q(s) = h_f h_p h_c \left(\frac{A_f A_p}{A}\right) L ,$$ (12)

where $h_f$, $h_p$, $h_c$ are polynomials which account for the "hidden modes" in the feedback sensor, plant, and controller, respectively. $A_f$ and $A_p$ are the denominator polynomials in the feedback sensor and plant transfer functions, respectively. Clearly, it is necessary that

I A A 

4>(s)  - h^hph^(^-£)  (13) 

be a strict Hurwitz polynomial* for asymptotic stability and this is assumed to be the case. Also, the controller transfer function can always be obtained with a minimal-order realization so that $h_c(s) = 1$. It is then apparent from Eqs. (12) and (13) that all freedom in the selection of $q(s)$ resides in the choice of $L(s)$. There are no hard and fast rules that can be stated for the selection of $L(s)$. It is at this step in the design that the art of engineering expresses itself. Of course, $L(s)$ strict Hurwitz is necessary and sufficient for asymptotic stability. Also, as established in the sequel, the degree of $L(s)$ is fixed by design objectives.

Since Eq. (3) is to be imposed, it is clear that a useful formula for $1 - S$ is needed. One easily finds that

$$1 - S = \frac{B(YL + AM)}{L} .$$ (14)

* One which is free of zeros in the half-plane Re s ≥ 0.
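Equations (1), (2), (10), (11) and (14) can be spot-checked numerically at a few complex frequencies. The sketch below is only an illustration under assumed data: the loop transfer function FP = 1/(s - 1), the Bezout pair X = 1, Y = 2 - s, and the polynomials L and M are made-up choices, not taken from the examples mentioned in this summary.

```python
def A(s):
    return s - 1.0            # assumed denominator of FP (pole at s = 1)


def B(s):
    return 1.0 + 0.0 * s      # assumed numerator of FP


def X(s):
    return 1.0 + 0.0 * s      # with Y below: AX + BY = (s - 1) + (2 - s) = 1


def Y(s):
    return 2.0 - s


def L(s):
    return (s + 1.0) * (s + 2.0)   # strict Hurwitz denominator of K


def M(s):
    return 3.0 * s + 1.0           # arbitrary numerator of K


def controller(s):
    """C(s) from Eq. (1) with K = M/L."""
    k = M(s) / L(s)
    return (Y(s) + A(s) * k) / (X(s) - B(s) * k)


def sensitivity(s):
    """S(s) = 1 / (1 + FPC), Eq. (2)."""
    return 1.0 / (1.0 + (B(s) / A(s)) * controller(s))


samples = [0.3 + 1.0j, 2.0 - 0.5j, -0.7 + 2.2j, 5.0 + 0.0j]
bezout_residual = 0.0
closed_loop_residual = 0.0
sensitivity_residual = 0.0
for s in samples:
    bezout_residual = max(bezout_residual,
                          abs(A(s) * X(s) + B(s) * Y(s) - 1.0))
    # A (XL - BM) + B (YL + AM) collapses to L because AX + BY = 1.
    closed = (A(s) * (X(s) * L(s) - B(s) * M(s))
              + B(s) * (Y(s) * L(s) + A(s) * M(s)))
    closed_loop_residual = max(closed_loop_residual, abs(closed - L(s)))
    rhs = B(s) * (Y(s) * L(s) + A(s) * M(s)) / L(s)       # Eq. (14)
    sensitivity_residual = max(sensitivity_residual,
                               abs((1.0 - sensitivity(s)) - rhs))
```

All three residuals should be at round-off level, which illustrates why the closed-loop modes are fixed by L(s) alone once X and Y satisfy Eq. (10).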

It is now necessary to distinguish between two cases in order to proceed.

Case 1: $FP \equiv$ constant

In this case, $A(s)$ and $B(s)$ are constants and in Eq. (10) one can choose

$$Y \equiv 0 , \quad X = A^{-1} .$$ (15)

Equation (14), then, reduces to

$$1 - S = BA\left(\frac{M}{L}\right) ,$$ (16)

where the product BA is equal to a nonzero constant. Now Eq. (3) is satisfied provided M and L are chosen so that*

$$\delta(L) = \delta(M) + \mu .$$ (17)

As pointed out previously, the design freedom is incorporated into the coefficients of $M(s)$. There are $\delta(M) + 1$ of these and this number is selected by the designer. Then $\delta(L)$ is fixed by Eq. (17) and the zeros of $L(s)$ are picked so that the closed-loop system modes are acceptable. Finally, the coefficients of $M(s)$ are determined so that desired system performance objectives are realized (e.g., a steady-state error specification).

The case described here is not likely in most practical circumstances. In fact, when the plant is nonminimum phase and/or unstable, there is no admissible pair** for which FP is identically constant. Even in the case of minimum-phase asymptotically stable plants, it is not likely that the feedback sensor will cancel all the poles and zeros of the plant. For these reasons, the emphasis in the sequel is on the second case.

Case 2: $FP \not\equiv$ constant

Either $A(s)$ or $B(s)$ or both are not constant in this case, and it follows that Eq. (7) is satisfied only if

$$\delta(YL) = \delta(AM) .$$ (18)

For suppose the contrary is true. Then,

$$\delta(YL + AM) \ge \delta(YL) \ge \delta(L) ,$$ (19)

and it immediately follows from Eq. (14) that

$$1 - S = O(s^q) , \quad |s| \to \infty$$ (20)


* For any polynomial $\phi(s)$, $\delta(\phi)$ denotes its degree.

** An admissible pair is one for which $\phi(s)$ in Eq. (13) is strictly Hurwitz.

where


$$q \ge \delta(BY) = \delta(B) + \delta(Y) .$$ (21)

Now Eq. (10) requires that

$$\delta(AX) = \delta(BY) .$$ (22)

Hence, it is also true that

$$q \ge \delta(A) + \delta(X) .$$ (23)

Since $\delta(A) \ge 1$ and/or $\delta(B) \ge 1$, it immediately follows that $q \ge 1$ and Eq. (7) is not satisfied.

Equation (18) is necessary, but not sufficient for the satisfaction of Equation (3). From Eq. (14), it is clear that

$$\delta(L) - \delta(B) - \delta(YL + AM) = \mu$$ (24)

is required. In light of Eqs. (18), (24), and $\mu \ge 1$,

$$\delta_r \equiv \delta(YL + AM) < \delta(YL) \equiv n$$ (25)

is necessary. Otherwise, the left-hand side of Eq. (24) is

$$\delta(L) - \delta(B) - \delta(YL) = -\delta(B) - \delta(Y) \le 0$$ (26)

and $\mu \ge 1$ is impossible. Substituting Eq. (25) into Eq. (24) yields

$$\delta_r = \delta(L) - \delta(B) - \mu$$ (27)

as the required value for the reduced degree of YL + AM. It is later verified that the designer can always guarantee $\delta(L) \ge \delta(B) + \mu$ so that $\delta_r \ge 0$ as it must be. Attention is now turned to the selection of $M(s)$ so that the polynomial YL + AM has degree $\delta_r < n$.

Let

$$YL = \sum_{k=0}^{n} y_k s^k ,$$ (28)

$$A = \sum_{i=0}^{n_a} a_i s^i$$ (29)

and

$$M = \sum_{j=0}^{n_m} m_j s^j .$$ (30)

Then, in view of Eq. (18),

$$AM = \sum_{i=0}^{n_a}\sum_{j=0}^{n_m} a_i m_j s^{i+j} = \sum_{k=0}^{n} c_k s^k$$ (31)

where

$$c_k = \sum_{j=0}^{k} a_{k-j}\,m_j\,U(n_m - j)\,U(n_a - k + j)$$ (32)

and

$$U(q) = \begin{cases} 1, & q \ge 0 \\ 0, & q < 0 . \end{cases}$$ (33)

The conditions, hence, which assure that the polynomial YL + AM has degree $\delta_r$ are

$$c_k = -y_k , \quad k = (\delta_r + 1) \to n .$$ (34)

Since

$$n_a - k + j \ge 0$$ (35)

only if

$$j \ge k - n_a ,$$ (36)

it readily follows from Eqs. (32) and (34) that

$$\sum_{j=k-n_a}^{k} a_{k-j}\,m_j\,U(n_m - j) = -y_k , \quad k = (\delta_r + 1) \to n .$$ (37)

Equation (37) constitutes a system of $n - \delta_r$ linear inhomogeneous equations in the

$$n_m - (\delta_r + 1 - n_a) + 1 = n_m + n_a - \delta_r = n - \delta_r$$ (38)

unknowns

$$m_{n_m},\ m_{n_m - 1},\ \ldots,\ m_{\delta_r + 1 - n_a} .$$ (39)

It is easily shown that this system of equations always has a unique solution for the unknowns provided

$$\delta_f \equiv \delta_r + 1 - n_a = \delta_r + 1 - \delta(A) \ge 0 .$$ (40)

Note that $\delta_f$ is the number of remaining coefficients in the polynomial $M(s)$ which can be used to satisfy design objectives other than Equation (3). Once $\delta_f$ is chosen, Eq. (40) fixes the value of $\delta_r$ and it follows from Eqs. (27) and (40) that

$$\delta(L) = \delta(B) + \mu + \delta_f + \delta(A) - 1 .$$ (41)

Equation (41) gives the degree of $L(s)$ in terms of known data once $\mu$ and $\delta_f$ have been selected. Clearly, the condition $\delta(L) \ge \delta(B) + \mu$ imposed by Eq. (27) is satisfied whenever

$$\delta_f + \delta(A) \ge 1 .$$ (42)

Since almost all problems of interest require some design freedom, $\delta_f \ge 1$, and Eq. (42) is satisfied. It is also rare to find any practical cases in which $\delta(A) = 0$. Hence, there is little, if any, real loss in imposing the restriction, Eq. (42), on the choice of $\delta_f$, and this is assumed in the sequel.


The degree of the polynomial $M(s)$ is fixed by Eq. (18). Specifically,

$$\delta(M) = \delta(Y) + \delta(L) - \delta(A)$$ (43)

or, with the aid of Eq. (41),

$$\delta(M) = \delta(Y) + \delta(B) + \mu + \delta_f - 1 .$$ (44)

Equations (41) and (44) are key relationships. The first step in the design procedure is to select $\mu$ and $\delta_f$. Then Eq. (41) fixes the degree of $L(s)$. The zeros of $L(s)$ are chosen next keeping in mind that $L(s)$ is a factor in the characteristic polynomial for the closed loop. Once $L(s)$ has been picked and $Y(s)$ has been selected so that Eq. (10) is satisfied,* the degree of $M(s)$ is set by Eq. (43). The first $n - \delta_r$ coefficients of $M(s)$ are then found by solving Equations (37). The remaining $\delta_f$ coefficients of $M(s)$ are set by design specifications. Once L and M are known, a straightforward computation gives $C(s)$ from

$$C(s) = \frac{YL + AM}{XL - BM} = \frac{B_c}{A_c} .$$ (45)
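The steps leading to Eq. (45) can be traced on a small example. The sketch below is an illustration under assumptions, not the authors' program: polynomials are ascending coefficient lists, the function name `design` and the sample data (FP = 1/(s - 1), X = 0, Y = 1, L = (s + 1)(s + 2), one free coefficient m_0 = 0.5) are hypothetical. It solves Eq. (37) by back-substitution, which works because the highest-order equation involves only $m_{n_m}$ and each lower one introduces exactly one new unknown multiplied by the leading coefficient of A.

```python
def poly_mul(p, q):
    """Product of two polynomials given as ascending coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def poly_add(p, q):
    """Sum of two polynomials, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0.0] * (n - len(p))
    q = list(q) + [0.0] * (n - len(q))
    return [u + v for u, v in zip(p, q)]


def degree(p, tol=1e-12):
    """Index of the highest coefficient that is not numerically zero."""
    for k in range(len(p) - 1, -1, -1):
        if abs(p[k]) > tol:
            return k
    return -1


def design(a, b, x, y, l_poly, mu, free):
    """Return (M, B_c, A_c); `free` holds the delta_f low-order coefficients of M."""
    delta_f = len(free)
    n_a = degree(a)
    delta_r = delta_f + n_a - 1                          # Eq. (40)
    assert degree(l_poly) == degree(b) + mu + delta_r    # Eq. (41)
    yl = poly_mul(y, l_poly)
    n = degree(yl)
    n_m = n - n_a                                        # Eq. (43)
    m = list(free) + [0.0] * (n_m + 1 - delta_f)
    for k in range(n, delta_r, -1):                      # Eq. (37), highest k first
        j0 = k - n_a                                     # the one new unknown here
        known = sum(a[k - j] * m[j] for j in range(j0 + 1, min(k, n_m) + 1))
        m[j0] = (-yl[k] - known) / a[n_a]
    b_c = poly_add(yl, poly_mul(a, m))                   # YL + AM
    a_c = poly_add(poly_mul(x, l_poly),
                   [-c for c in poly_mul(b, m)])         # XL - BM
    return m, b_c, a_c


# Hypothetical data: FP = 1/(s - 1), so A = s - 1, B = 1 and AX + BY = 1
# holds with X = 0, Y = 1.  Chosen L(s) = (s + 1)(s + 2), mu = 1, delta_f = 1.
a, b, x, y = [-1.0, 1.0], [1.0], [0.0], [1.0]
l_poly = [2.0, 3.0, 1.0]
m, b_c, a_c = design(a, b, x, y, l_poly, mu=1, free=[0.5])
closed_loop = poly_add(poly_mul(a, a_c), poly_mul(b, b_c))   # should equal L
```

For these data the sketch gives M(s) = 0.5 - s and C(s) = (4.5s + 1.5)/(s - 0.5), with A·A_c + B·B_c reproducing L(s); the controller itself has a right half-plane pole, yet the closed-loop modes are exactly the zeros of L.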


When $\mu$ is selected so that the controller transfer function is proper or strictly proper, then the minimal order of the dynamical system which realizes this transfer function is the degree of $A_c(s)$, the denominator of $C(s)$. It is not difficult to show that

$$S = \frac{A A_c}{L} .$$ (46)

It immediately follows that

$$\delta(A_c) = \delta(L) - \delta(A) = \delta(B) + \mu + \delta_f - 1$$ (47)

since $S \to 1$, $|s| \to \infty$. Equation (46) is of interest in those cases where the degree of $C(s)$ is an important design factor.

When $C(s)$ is improper, not only is the degree of the denominator important, but the degree of the numerator as well. It follows from Eqs. (45), (25) and (40) that

$$\delta(B_c) = \delta_r = \delta_f + \delta(A) - 1 .$$ (48)

It is interesting (and comforting) to note that both $\delta(A_c)$ and $\delta(B_c)$ are independent of the choice of $X(s)$, $Y(s)$ satisfying Equation (10). That this is the case is obvious upon inspection of Eqs. (47) and (48).

* There exist unique polynomials $X(s)$, $Y(s)$ satisfying Eq. (10) and $\delta(X) \le \delta(B) - 1$, $\delta(Y) \le \delta(A) - 1$. See Ref. 5, Theorem 4.

It is possible for the computed $M(s)$ not to be relatively prime to the selected $L(s)$. A factor common to both YL + AM and XL - BM then exists. When the degree of this factor is denoted by $\delta_c$, one obtains in place of Eqs. (47) and (48)

$$\delta(A_c) = \delta(B) + \mu + \delta_f - 1 - \delta_c$$ (49)

and

$$\delta(B_c) = \delta_f + \delta(A) - 1 - \delta_c ,$$ (50)

respectively. It is also true in this case that Eq. (12) is no longer the characteristic polynomial for the closed loop. The correct expression is obtained from Eq. (12) by replacing $L(s)$ with the polynomial which results when $L(s)$ is divided by the factor common to M and L. It is recommended that all designs be initiated under the assumption that L and M turn out to be relatively prime: this is the most likely case.

B.  Discussion 

A straightforward design procedure for single-input-output systems described by rational transfer functions has been developed. The method has been applied to several examples and these details are contained in a paper to be submitted for publication. The design method is easily programmed on a digital computer and should replace all classical design techniques based on the root locus method. The fact that the closed-loop system modes are selected by the designer at the outset is a significant advantage of the approach described here. The ability to introduce the constraints Eqs. (3) and (7) is important practically and this cannot be overemphasized.

GENERALIZED IMAGE RESTORATION BY THE METHOD OF ALTERNATING ORTHOGONAL PROJECTIONS

D.C. Youla

The view adopted in this report is that the problem of image restoration is essentially geometric in character and can be formulated as follows: The complete image f is a vector known a priori to belong to a linear subspace $\mathcal{P}_b$, but all that is available is its projection $P_a f$ onto a known linear subspace $\mathcal{P}_a$. 1) Find the necessary and sufficient conditions under which f is uniquely determined by $P_a f$ and 2) find the necessary and sufficient conditions guaranteeing the stable reconstruction of f from $P_a f$ in the face of noise. (In the latter case the reconstruction problem is said to be well-posed.) The answers turn out to be remarkably simple.

($a_1$) f is uniquely determined by $P_a f$ iff $\mathcal{P}_b$ and the orthogonal complement of $\mathcal{P}_a$ have only the zero vector in common.

($a_2$) The reconstruction problem is well-posed iff the angle between $\mathcal{P}_b$ and the orthogonal complement of $\mathcal{P}_a$ is greater than zero. (All angles lie in the first quadrant.)

($a_3$) In both cases ($a_1$) and ($a_2$) there exists an effective recursive algorithm for the recovery of f employing only the operations of projection onto $\mathcal{P}_b$ and projection onto the orthogonal complement of $\mathcal{P}_a$.

It appears natural to carry out the proofs of ($a_1$) - ($a_3$) in a Hilbert space setting.

A.  Preliminaries 

Consider a Hilbert space $\mathcal{H}$ with elements f, g, h, x, z, etc., a zero vector $\phi$ and an inner product (x, y). By definition,

$$\|f\| = \sqrt{(f, f)} \ge 0$$ (1)

is the "length" of f and the sequence $\{f_k\}$ is said to converge to f ($f_k \to f$) if

$$\lim_{k \to \infty} \|f_k - f\| = 0 .$$ (2)

Let $\mathcal{P}$ be any closed linear manifold (CLM) in $\mathcal{H}$ and $\perp\mathcal{P}$ its orthogonal complement. According to the projection theorem* every $f \in \mathcal{H}$ possesses a unique decomposition

$$f = g + h$$ (3)

in which $g \in \mathcal{P}$ and $h \in \perp\mathcal{P}$. Symbolically,

$$\mathcal{H} = \mathcal{P} \oplus \perp\mathcal{P} .$$ (4)

Since (g, h) = 0, the vectors g and h are mutually orthogonal and the two linear operators P and Q defined by the rules g = Pf, h = Qf are the associated orthogonal projection operators projecting onto $\mathcal{P}$ and $\perp\mathcal{P}$, respectively. As is well known, $P^2 = P$ and $Q^2 = Q = 1 - P$. Moreover, P is self-adjoint ($P = P^*$) whence (Px, y) = (x, Py) for all x and y in $\mathcal{H}$.

* All the general functional analysis needed to read this section can be found in Reference 1. Regarding notation, the symbols $\in$, $\cap$, $\cup$ and 1 stand for set membership, intersection, union and the identity operator. A closed linear manifold is a linear manifold which contains all its limit points (and therefore equals its closure). Our Hilbert space $\mathcal{H}$ is assumed to be complete.
Now suppose an element $f \in \mathcal{H}$ belongs to a known CLM $\mathcal{P}_b$, but we are given only its projection* $g = P_a f$ onto the known CLM $\mathcal{P}_a$. How do we go about reconstructing f from g? Let $P_a$, $Q_a$, $P_b$, $Q_b$ denote the projection operators projecting onto $\mathcal{P}_a$, $\perp\mathcal{P}_a$, $\mathcal{P}_b$ and $\perp\mathcal{P}_b$. Then, $f \in \mathcal{P}_b$ implies $f = P_b f$ and

$$g = P_a f = P_a P_b f = (1 - Q_a)P_b f = P_b f - Q_a P_b f = f - Q_a P_b f .$$ (5)

Thus f satisfies the operator equation

$$Af = g$$ (6)

where

$$A = 1 - Q_a P_b .$$ (7)

Clearly, A is a bounded** operator whose domain D(A) is all of $\mathcal{H}$ and whose range R(A) is the set of all vectors $g \in \mathcal{H}$ admitting the representation Eq. (6) for some choice of $f \in \mathcal{H}$.

The vector f is uniquely determined by g iff the inverse operator

$$T = A^{-1}$$ (8)

exists for then

$$f = Tg .$$ (9)

However, to examine the matter of stable reconstruction it is necessary to evaluate the performance of Eq. (9) in the presence of noise. The effect of noise is to corrupt g to $g + \Delta g$ and

$$\Delta f = T(\Delta g)$$ (10)

is the error induced in f. In general there is no restriction on the orientation of $\Delta g$ in $\mathcal{H}$ and the reconstruction process Eq. (9) is executable without exception iff the domain of T is all of $\mathcal{H}$. Moreover, to preclude the possibility of arbitrarily large percentage errors one must also impose the condition

$$\sup_{\Delta g} \frac{\|T(\Delta g)\|}{\|\Delta g\|} < \infty .$$ (11)

In other words,

$$\|T\| = \|A^{-1}\| < \infty$$ (12)

and the inverse operator is forced to be bounded. The task is to translate the above requirements onto $\mathcal{P}_a$ and $\mathcal{P}_b$.

* Unless stated explicitly otherwise, all projections are orthogonal.

** A linear operator T is bounded if

$$\|T\| \equiv \sup_{x \in D(T)} \frac{\|Tx\|}{\|x\|} < \infty .$$ (9a)

Any orthogonal projection operator has norm equal to one.

The algorithm for the recovery of $f$ is inspired by the equation

$$f = g + Q_a P_b f \tag{13}$$

which immediately suggests the fundamental recursion

$$f_{k+1} = g + Q_a P_b f_k, \quad k = 1 \to \infty; \quad f_1 = g. \tag{14}$$

Read from right to left, Eq. (14) states that $f_{k+1}$ is created by projecting $f_k$ onto $\mathcal{P}_b$, then projecting the result onto $\mathcal{P}_a^{\perp}$, and finally adding $g$ to restore the correct projection onto $\mathcal{P}_a$. The geometric significance of this cycle of operations can be grasped very easily with the help of the two diagrams in Figures 1(a) and 1(b).

The three CLMs $\mathcal{P}_a$, $\mathcal{P}_a^{\perp}$ and $\mathcal{P}_b$ are indicated as three straight lines of infinite extent passing through the origin. Although the instrumentality is incapable of synthesizing $f$ from $P_a f = OB = g$ by direct movement back along the perpendicular $BA$ to $\mathcal{P}_a$, it is nevertheless possible to reach intermediate points such as $D$, $F$, $H$, etc., which tend to the limit $A$. For starting with $OB$ and projecting it onto $\mathcal{P}_b$ we obtain $OC$. Since the projection of $OC$ onto $\mathcal{P}_a$ is unequal to $OB$ and therefore incorrect, the next step is one of restoration. This is accomplished by projecting $OC$ onto $\mathcal{P}_a^{\perp}$ and then adding the resulting vector ($= BD$) to $OB$. Thus $OD$ is the second approximation to $f$. Further repetitions of this method of alternating projections yield the length-increasing vector approximations $OF$, $OH$, etc., which obviously converge to $f = OA$ provided the lines $\mathcal{P}_b$ and $\mathcal{P}_a^{\perp}$ do not coincide. We now proceed to show that this heuristic reasoning can be made rigorous.
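The cycle of operations in Eq. (14) can be tried out in a finite-dimensional setting. The following is a minimal sketch, assuming $\mathcal{H} = \mathbb{R}^4$ and two hand-picked subspaces; the helper names `projector` and `alternating_projections` and the numerical data are illustrative assumptions and do not appear in the report.

```python
import numpy as np

def projector(basis):
    """Orthogonal projector onto the column span of `basis`."""
    q, _ = np.linalg.qr(np.asarray(basis, dtype=float))
    return q @ q.T

def alternating_projections(g, P_a, P_b, steps):
    """Return f_steps from f_{k+1} = g + Q_a P_b f_k with f_1 = g (Eq. 14)."""
    Q_a = np.eye(len(g)) - P_a
    f = g.copy()
    for _ in range(steps - 1):
        f = g + Q_a @ (P_b @ f)   # project onto P_b, then onto the complement of P_a, then add g
    return f

# Hypothetical example in R^4 (not taken from the report).
P_a = projector([[1, 0], [0, 1], [0, 0], [0, 0]])   # the "measured" coordinates
P_b = projector([[1, 0], [0, 1], [1, 0], [0, 2]])   # manifold known to contain f
f_true = np.array([1.0, 1.0, 1.0, 2.0])             # sum of the two basis columns of P_b
g = P_a @ f_true                                    # the only data available
f_rec = alternating_projections(g, P_a, P_b, steps=200)
recovery_error = np.linalg.norm(f_rec - f_true)
```

In this example the error components shrink by the factors 0.5 and 0.8 per cycle, so `f_rec` agrees with `f_true` to rounding accuracy after 200 steps.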
Fig. 1. The geometry of reconstruction in Hilbert space.


B.  The  Main  Results 

The inner product permits us to define the angle between two elements $f$ and $g$ in $\mathcal{H}$ as follows. By the Schwarz inequality we know that

$$0 \le \frac{|(f,g)|}{\|f\| \cdot \|g\|} \le 1. \tag{15}$$

We may therefore define the angle $\psi(f,g)$ between $f$ and $g$ by

$$\cos\psi(f,g) = \frac{|(f,g)|}{\|f\| \cdot \|g\|}, \quad 0 \le \psi(f,g) \le \frac{\pi}{2}. \tag{16}$$


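The angle $\psi(\mathcal{P}_1,\mathcal{P}_2)$ between two linear manifolds $\mathcal{P}_1$ and $\mathcal{P}_2$ is defined by the expression

$$\psi(\mathcal{P}_1,\mathcal{P}_2) \equiv \inf_{f \in \mathcal{P}_1,\; g \in \mathcal{P}_2} \psi(f,g). \tag{17}$$

Equivalently, since $\cos\psi$ is a monotone decreasing function of $\psi$ in the interval $0 \le \psi \le \pi/2$,

$$\cos\psi(\mathcal{P}_1,\mathcal{P}_2) = \sup_{f \in \mathcal{P}_1,\; g \in \mathcal{P}_2} \frac{|(f,g)|}{\|f\| \cdot \|g\|}. \tag{18}$$

(If either $\mathcal{P}_1$ or $\mathcal{P}_2 = \{\phi\}$, $\cos\psi(\mathcal{P}_1,\mathcal{P}_2) \equiv 0$ and $\psi(\mathcal{P}_1,\mathcal{P}_2) = \pi/2$.)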

The  next  three  lemmas  not  only  play  a key  role  in  the  proof  of  theorem  1 but  are  also 
of  considerable  interest  in  their  own  right  and  serve  to  highlight  the  importance  of  the 
angle  concept. 

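Lemma 1. If $\psi(\mathcal{P}_1,\mathcal{P}_2) > 0$, $\mathcal{P}_1$ and $\mathcal{P}_2$ contain only the zero vector $\phi$ in common.

Proof. Let $x \neq \phi$ belong to $\mathcal{P}_1 \cap \mathcal{P}_2$. Then $x \in \mathcal{P}_1$, $x \in \mathcal{P}_2$, $\cos\psi(x,x) = 1$, $\psi(x,x) = 0$ and $\psi(\mathcal{P}_1,\mathcal{P}_2) = 0$, a contradiction, Q.E.D.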

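Lemma 2. Let $P_1$ and $P_2$ denote the projection operators projecting onto the CLMs $\mathcal{P}_1$ and $\mathcal{P}_2$, respectively. Then,

$$\|P_1 P_2\| = \cos\psi(\mathcal{P}_1,\mathcal{P}_2) = \|P_2 P_1\|. \tag{19}$$

Proof. (We assume that $\mathcal{P}_1 \neq \{\phi\}$ and $\mathcal{P}_2 \neq \{\phi\}$; otherwise the lemma is trivial.) For any $f \in \mathcal{P}_1$, $f = P_1 f$ and for fixed $g \in \mathcal{P}_2$,

$$\frac{|(f,g)|}{\|f\| \cdot \|g\|} = \frac{|(P_1 f,g)|}{\|f\| \cdot \|g\|} = \frac{|(f,P_1 g)|}{\|f\| \cdot \|g\|} \le \frac{\|P_1 g\|}{\|g\|}$$

with equality iff $f = \lambda P_1 g$, $\lambda$ an arbitrary scalar. Thus, since $g = P_2 g$,

$$\cos\psi(\mathcal{P}_1,\mathcal{P}_2) = \sup_{g \in \mathcal{P}_2} \frac{\|P_1 P_2 g\|}{\|g\|}. \tag{21}$$

From the definition Eq. (9a) of operator norm it is clear that the sup on the right-hand side of Eq. (21) cannot exceed $\|P_1 P_2\|$. Conversely, if $h$ is an arbitrary member of $\mathcal{H}$, $h = g + x$ where $g \in \mathcal{P}_2$ and $x \in \mathcal{P}_2^{\perp}$, and

$$\frac{\|P_1 P_2 h\|}{\|h\|} = \frac{\|P_1 P_2 g\|}{(\|g\|^2 + \|x\|^2)^{1/2}} \le \frac{\|P_1 P_2 g\|}{\|g\|},$$

implying

$$\|P_1 P_2\| = \sup_{g \in \mathcal{P}_2} \frac{\|P_1 P_2 g\|}{\|g\|} = \sup_{g \in \mathcal{P}_2} \frac{\|P_1 g\|}{\|g\|}. \tag{24}$$

Since $\|P_1 P_2\| = \|(P_1 P_2)^*\| = \|P_2 P_1\|$, Eq. (19) is established.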
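Lemma 2 turns the angle between two manifolds into an operator norm, which is easy to evaluate numerically in finite dimensions. The sketch below assumes two lines in the plane separated by 30 degrees; the function names `projector` and `cos_angle` are illustrative and not part of the report.

```python
import numpy as np

def projector(basis):
    """Orthogonal projector onto the column span of `basis`."""
    q, _ = np.linalg.qr(np.asarray(basis, dtype=float))
    return q @ q.T

def cos_angle(P1, P2):
    """cos psi(P_1, P_2) computed as the spectral norm ||P_1 P_2|| (Lemma 2)."""
    return np.linalg.norm(P1 @ P2, 2)

# Two lines through the origin of the plane, 30 degrees apart (assumed example).
theta = np.pi / 6
P1 = projector([[1.0], [0.0]])
P2 = projector([[np.cos(theta)], [np.sin(theta)]])
c12 = cos_angle(P1, P2)     # ||P_1 P_2||
c21 = cos_angle(P2, P1)     # ||P_2 P_1||
```

Both norms equal $\cos 30^\circ$, in agreement with Eq. (19).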

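Lemma 3. Let $\mathcal{P}_1$ and $\mathcal{P}_2$ be two closed linear manifolds such that $\mathcal{P}_1 \cap \mathcal{P}_2 = \{\phi\}$. Then, $\mathcal{P} = \mathcal{P}_1 + \mathcal{P}_2$ is closed if and only if $\psi(\mathcal{P}_1,\mathcal{P}_2) > 0$.

Proof.* Sufficiency. According to lemma 2, $\psi(\mathcal{P}_1,\mathcal{P}_2) > 0$ implies $\|P_2 P_1\| < 1$, $P_1$ and $P_2$ the projection operators projecting onto $\mathcal{P}_1$ and $\mathcal{P}_2$, respectively. Recall that $\mathcal{P}$ is the set of all vectors of the form $f = x + y$, $x \in \mathcal{P}_1$, $y \in \mathcal{P}_2$. Let $\{f_k = x_k + y_k\}$ be any sequence in $\mathcal{P}$ converging to a limit $f$. What must be shown is that $f \in \mathcal{P}$. Write

$$f_k = (x_k - P_2 P_1 x_k) + (y_k + P_2 P_1 x_k), \tag{25}$$

where $x_k \in \mathcal{P}_1$, $y_k \in \mathcal{P}_2$, $x_k - P_2 P_1 x_k \in \mathcal{P}_2^{\perp}$ and $y_k + P_2 P_1 x_k \in \mathcal{P}_2$. Then

$$\|f_k - f_l\|^2 = \|(1 - P_2 P_1)(x_k - x_l)\|^2 + \|(y_k - y_l) + P_2 P_1 (x_k - x_l)\|^2, \tag{26}$$

$k, l = 1 \to \infty$. Hence $\|f_k - f_l\| \to 0$ implies

$$\|(1 - P_2 P_1)(x_k - x_l)\| \to 0 \tag{27}$$

and

$$\|(y_k - y_l) + P_2 P_1 (x_k - x_l)\| \to 0. \tag{28}$$

But

$$\|(1 - P_2 P_1)(x_k - x_l)\| \ge (1 - \|P_2 P_1\|) \cdot \|x_k - x_l\|, \tag{29}$$

which together with $\|P_2 P_1\| < 1$ and Eq. (27) yields $\|x_k - x_l\| \to 0$. Thus $x_k \to x \in \mathcal{P}_1$ because $\mathcal{P}_1$ is closed. Invoking Eq. (28), $\|y_k - y_l\| \to 0$, whence $y_k \to y \in \mathcal{P}_2$ because $\mathcal{P}_2$ is closed. It now follows from $f_k = x_k + y_k$ that $f = x + y \in \mathcal{P}$, Q.E.D.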

Necessity (the difficult part). Suppose $\mathcal{P} = \mathcal{P}_1 + \mathcal{P}_2$ is closed. Then $\mathcal{P}$, considered by itself, is a complete Hilbert space. In view of the assumption $\mathcal{P}_1 \cap \mathcal{P}_2 = \{\phi\}$, $\mathcal{P}$ actually equals $\mathcal{P}_1 \dotplus \mathcal{P}_2$, the "direct" sum of $\mathcal{P}_1$ and $\mathcal{P}_2$. Every vector $f$ in $\mathcal{P}$ therefore possesses a unique decomposition $f = x + y$, $x \in \mathcal{P}_1$, $y \in \mathcal{P}_2$, and the rule $x = Lf$ defines a linear (oblique) projection operator $L$ projecting $\mathcal{P}$ onto $\mathcal{P}_1$ parallel to $\mathcal{P}_2$. The domain of $L$ is all of $\mathcal{P}$ and to prove that it is bounded in $\mathcal{P}$, i.e., that

$$\|L\|_{\mathcal{P}} \equiv \sup_{f \in \mathcal{P}} \frac{\|Lf\|}{\|f\|} < \infty, \tag{30}$$

it suffices to establish that it is closed in $\mathcal{P}$ (Ref. 1, p. 165, Thm. 8). Now for every $f \in \mathcal{P}$ we have $f = Lf + (f - Lf)$. Let $\lim f_k = f_0$, $\lim Lf_k = g_0$, $f_k \in \mathcal{P}$, $k = 1 \to \infty$. To verify that $L$ is closed in $\mathcal{P}$ we must show that $g_0 = Lf_0$. Since all $f_k$ are in $\mathcal{P}$, all $Lf_k$ are in $\mathcal{P}_1$ and all $f_k - Lf_k$ are in $\mathcal{P}_2$, we have $f_0 \in \mathcal{P}$, $g_0 \in \mathcal{P}_1$ and $\lim (f_k - Lf_k) = f_0 - g_0 = h_0 \in \mathcal{P}_2$ because $\mathcal{P}$, $\mathcal{P}_1$ and $\mathcal{P}_2$ are closed. Consequently, $f_0 = g_0 + h_0$, $g_0 \in \mathcal{P}_1$, $h_0 \in \mathcal{P}_2$ and it follows that $g_0 = Lf_0$; i.e., $L$ is closed in $\mathcal{P}$, Q.E.D.

*We assume that $\mathcal{P}_1 \neq \{\phi\}$ and $\mathcal{P}_2 \neq \{\phi\}$; otherwise the lemma is trivial.

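In this final step of the proof we relate $\|L\|_{\mathcal{P}}$ to $\sin\psi(\mathcal{P}_1,\mathcal{P}_2)$. Clearly, since any $f$ in $\mathcal{P}$ is of the form $f = x + y$, $x \in \mathcal{P}_1$, $y \in \mathcal{P}_2$,

$$\|L\|_{\mathcal{P}} = \sup_{f \in \mathcal{P}} \frac{\|Lf\|}{\|f\|} = \sup_{x \in \mathcal{P}_1,\; y \in \mathcal{P}_2} \frac{\|x\|}{\|x + y\|}. \tag{31}$$

With $x \in \mathcal{P}_1$ held fixed,

$$\inf_{y \in \mathcal{P}_2} \|x + y\| = \text{distance from } x \text{ to } \mathcal{P}_2 = \|Q_2 x\|,$$

and using $x = P_1 x$ Eq. (31) goes into

$$\|L\|_{\mathcal{P}} = \sup_{x \in \mathcal{P}_1} \frac{\|P_1 x\|}{\|Q_2 P_1 x\|}.$$

Writing $P_1 x = P_2 P_1 x + Q_2 P_1 x$ we have $\|P_1 x\|^2 = \|P_2 P_1 x\|^2 + \|Q_2 P_1 x\|^2$, from which

$$\sup_{x \in \mathcal{P}_1} \left(1 - \frac{\|Q_2 P_1 x\|^2}{\|P_1 x\|^2}\right) = 1 - \inf_{x \in \mathcal{P}_1} \frac{\|Q_2 P_1 x\|^2}{\|P_1 x\|^2} = \|P_2 P_1\|^2 = \cos^2\psi(\mathcal{P}_1,\mathcal{P}_2),$$

by lemma 2. Thus

$$\|L\|_{\mathcal{P}} = \frac{1}{\sin\psi(\mathcal{P}_1,\mathcal{P}_2)},$$

and $\|L\|_{\mathcal{P}} < \infty$ implies $\psi(\mathcal{P}_1,\mathcal{P}_2) > 0$. This completes the proof of lemma 3, Q.E.D.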
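The relation between the norm of the oblique projection and the angle can be checked in the plane. The following sketch assumes $\mathcal{P}_1$ and $\mathcal{P}_2$ are two lines separated by an angle `theta`; the variable names are illustrative only.

```python
import numpy as np

theta = np.pi / 6                                   # assumed angle between the two lines
u = np.array([1.0, 0.0])                            # spans P_1
v = np.array([np.cos(theta), np.sin(theta)])        # spans P_2

# Every f has the unique decomposition f = x*u + y*v; the coefficients are
# obtained by inverting the basis matrix [u v].
coeffs = np.linalg.inv(np.column_stack([u, v]))
L = np.outer(u, coeffs[0])                          # oblique projector onto P_1 parallel to P_2
norm_L = np.linalg.norm(L, 2)
```

Here `norm_L` equals $1/\sin\theta = 2$, and it grows without bound as the two lines approach each other.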

Theorem 1. Let $\mathcal{P}_a$ and $\mathcal{P}_b$ be any two closed linear manifolds in $\mathcal{H}$ and $P_a$, $Q_a$, $P_b$, $Q_b$ the (orthogonal) projection operators projecting onto $\mathcal{P}_a$, $\mathcal{P}_a^{\perp}$, $\mathcal{P}_b$ and $\mathcal{P}_b^{\perp}$, respectively. Suppose $f \in \mathcal{P}_b$. Then,

($a_1$) $f$ is uniquely determined by its projection $P_a f$ onto $\mathcal{P}_a$ if and only if

$$\mathcal{P}_b \cap \mathcal{P}_a^{\perp} = \{\phi\}. \tag{34}$$

($a_2$) The problem of reconstructing $f$ from $P_a f$ is well-posed if and only if

$$\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp}) > 0. \tag{35}$$

This angle constraint is satisfied iff

$$\|Q_a P_b\| \equiv \rho < 1. \tag{35a}$$

($a_3$) In both cases ($a_1$) and ($a_2$), the sequence $\{f_k\}$ generated by the recursion

$$f_{k+1} = g + Q_a P_b f_k, \quad k = 1 \to \infty; \quad f_1 = g = P_a f \tag{36}$$

converges to $f$ in norm, i.e.,

$$\lim_{k \to \infty} f_k = f. \tag{37}$$

The convergence is strict monotone increasing; i.e., $\|f_k\| \uparrow \|f\|$.

Proof. ($a_1$) According to Eqs. (6) and (7), $Af = g$ where $A = 1 - Q_a P_b$. The vector $f$ is uniquely determined by $g$ iff $A^{-1}$ exists; i.e., iff the equation

$$A f_0 = (1 - Q_a P_b) f_0 = \phi$$

possesses the unique solution $f_0 = \phi$. Clearly, $f_0 = Q_a P_b f_0$ implies $\|f_0\| = \|Q_a P_b f_0\|$, which is possible iff $f_0 \in \mathcal{P}_b \cap \mathcal{P}_a^{\perp}$. Conversely, any $f_0 \in \mathcal{P}_b \cap \mathcal{P}_a^{\perp}$ satisfies $A f_0 = \phi$, whence $A^{-1}$ exists iff $\mathcal{P}_b \cap \mathcal{P}_a^{\perp} = \{\phi\}$, Q.E.D.

($a_2$) If the reconstruction problem is well-posed, $T = A^{-1}$ must exist and be bounded. The domain $D(T)$ of $T$ is the range of $A$, which is the collection of all vectors $g$ of the form

$$g = f - Q_a P_b f, \quad f \in \mathcal{H}. \tag{38}$$

The existence of $T$ implies that $D(T)$ is dense in $\mathcal{H}$. Suppose to the contrary that some $h \in \mathcal{H}$ is orthogonal to $D(T)$. Then,

$$0 = (h, Af) = (A^* h, f) = ((1 - P_b Q_a) h, f)$$

for all $f \in \mathcal{H}$. Hence $(1 - P_b Q_a) h = \phi$ or $h = P_b Q_a h$, and reasoning as before $h \in \mathcal{P}_b \cap \mathcal{P}_a^{\perp}$. From ($a_1$), $h = \phi$, proving that $D(T)$ is dense in $\mathcal{H}$.

Let us now impose the additional condition that $T$ be bounded and let $g$ be an arbitrary element in $\mathcal{H}$. Since $D(T)$ is dense in $\mathcal{H}$ there exists a sequence

$$g_k = f_k - Q_a P_b f_k \tag{39}$$

which is contained in $D(T)$ and converges to $g$. From $f_k = T g_k$ and the boundedness of $T$ we conclude that the sequence $\{f_k\}$ also converges to some $f \in \mathcal{H}$. Going to the limit in Eq. (39),

$$g = f - Q_a P_b f = (1 - Q_a P_b) f. \tag{40}$$

Thus every $g \in R(A)$ and the domain of $T$ is the entire space $\mathcal{H}$. Writing $f = P_b f + Q_b f$, it follows from Eq. (40) that every $g \in \mathcal{H}$ admits the representation

$$g = Q_b f + P_a P_b f = x + y \tag{41}$$

where $x = Q_b f \in \mathcal{P}_b^{\perp}$ and $y = P_a P_b f \in \mathcal{P}_a$. In other words, $\mathcal{P}_b^{\perp} + \mathcal{P}_a = \mathcal{H}$, a CLM. But then (Ref. 2, Thm. 4.8), $(\mathcal{P}_b^{\perp})^{\perp} + \mathcal{P}_a^{\perp} = \mathcal{P}_b + \mathcal{P}_a^{\perp}$ is also closed and, invoking lemmas 3 and 2, $\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp}) > 0$ or,

$$\rho = \|Q_a P_b\| = \cos\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp}) < 1. \tag{42}$$

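Inversely, if $\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp}) > 0$, then $\mathcal{P}_b \cap \mathcal{P}_a^{\perp} = \{\phi\}$ by lemma 1 and the explicit solution of Eq. (40) is given by

$$f = \sum_{r=0}^{\infty} (Q_a P_b)^r g,$$

which converges like a geometric series since $\|Q_a P_b\| = \rho < 1$. From Eq. (41), if $f \in \mathcal{P}_b$ then $g = P_a P_b f \in \mathcal{P}_a$. Again, if $g \in \mathcal{P}_a$, $Q_b f = g - P_a P_b f \in \mathcal{P}_a$, implying that $Q_b f$ belongs to both $\mathcal{P}_b^{\perp}$ and $\mathcal{P}_a$. Thus, if $\mathcal{P}_b^{\perp} \cap \mathcal{P}_a = \{\phi\}$, $Q_b f = \phi$ and $f \in \mathcal{P}_b$. Consequently, under the additional (but unnecessary) assumption $\mathcal{P}_b^{\perp} \cap \mathcal{P}_a = \{\phi\}$, $f \in \mathcal{P}_b$ iff $g \in \mathcal{P}_a$. Finally, from Eq. (40),

$$\|A^{-1}\| \le \frac{1}{1 - \|Q_a P_b\|},$$

which coupled with Eq. (42) yields

$$\|A^{-1}\| \le \frac{1}{1 - \cos\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp})}. \tag{44}$$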


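($a_3$) If $g$ is in the range of $A$ it has the form $g = f - Q_a P_b f$. Iterating Eq. (36) we find that

$$f_k = \sum_{r=0}^{k-1} (Q_a P_b)^r g = \sum_{r=0}^{k-1} (Q_a P_b)^r (1 - Q_a P_b) f = f - (Q_a P_b)^k f. \tag{45}$$

According to von Neumann's theorem on alternating projections (Ref. 3, Thm. 13.7), for every $f \in \mathcal{H}$,

$$\lim_{k \to \infty} (Q_a P_b)^k f = f_0 \tag{46}$$

where $f_0$ is the projection of $f$ onto the CLM $\mathcal{P}_b \cap \mathcal{P}_a^{\perp}$. Since $\mathcal{P}_b \cap \mathcal{P}_a^{\perp} = \{\phi\}$, $f_0 = \phi$ and $\lim f_k = f$ in both cases ($a_1$) and ($a_2$). That $\|f_k\| \uparrow \|f\|$ is easily seen from Eq. (45). The error is

$$e_k = f - f_k = (Q_a P_b)^k f, \quad k = 1 \to \infty. \tag{47}$$

Clearly, $e_{k+1} = (Q_a P_b) e_k$ implies $\|e_{k+1}\| \le \|e_k\|$ with equality iff $e_k \in \mathcal{P}_b \cap \mathcal{P}_a^{\perp}$; i.e., iff $e_k = \phi$. It follows that if $k_0$ is the smallest integer for which $\|e_{k_0+1}\| = \|e_{k_0}\|$, then $e_k = \phi$ for $k \ge k_0$ and $\|e_{k+1}\| < \|e_k\|$ for $k < k_0$; i.e., theorem 1 is proved, Q.E.D.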
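The partial sums in Eq. (45), the strict decrease of the error norm and the bound of Eq. (44) can be observed numerically. The sketch below assumes a hand-picked pair of subspaces in $\mathbb{R}^4$ for which $\rho = 2/\sqrt{5}$; all names and data are illustrative assumptions rather than quantities from the report.

```python
import numpy as np

def projector(basis):
    """Orthogonal projector onto the column span of `basis`."""
    q, _ = np.linalg.qr(np.asarray(basis, dtype=float))
    return q @ q.T

I4 = np.eye(4)
P_a = projector([[1, 0], [0, 1], [0, 0], [0, 0]])
P_b = projector([[1, 0], [0, 1], [1, 0], [0, 2]])
M = (I4 - P_a) @ P_b                     # the operator Q_a P_b

rho = np.linalg.norm(M, 2)               # cos psi(P_b, complement of P_a), Eq. (42)
inv_norm = np.linalg.norm(np.linalg.inv(I4 - M), 2)   # ||A^{-1}||
bound = 1.0 / (1.0 - rho)                # right-hand side of Eq. (44)

f = np.array([1.0, 1.0, 1.0, 2.0])       # a vector lying in P_b
g = P_a @ f
partial, term, errors = np.zeros(4), g.copy(), []
for k in range(1, 21):
    partial = partial + term             # f_k = sum_{r<k} (Q_a P_b)^r g, Eq. (45)
    term = M @ term
    errors.append(np.linalg.norm(f - partial))   # ||e_k||, Eq. (47)
```

For this example `errors` decreases strictly at every step and `inv_norm` stays below `bound`.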

As we have seen, if $\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp}) = 0$ the reconstruction problem is unstable ($\|A^{-1}\| = \infty$) and the numerical performance of Eq. (36) depends on the fine details of the noise and its specific "orientation" with respect to the CLMs $\mathcal{P}_a$ and $\mathcal{P}_b$. However, if $\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp}) > 0$ a satisfactory error analysis is possible.

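Corollary 1. Let noise corrupt $g = P_a f$ to $g + \Delta g$ and let $\{f_k\}$ denote the sequence of approximations generated by the recursion Eq. (36) under the initialization $f_1 = g + \Delta g$. Let $f_\infty = \lim f_k$ and $f_\infty - f = \Delta f$. Suppose $\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp}) > 0$. Then,

(1) if $f \in \mathcal{P}_b$, $\Delta g \in \mathcal{P}_a$ and $\mathcal{P}_b^{\perp} \cap \mathcal{P}_a = \{\phi\}$,

$$\frac{\|\Delta f\|}{\|\Delta g\|} \le \frac{1}{\sin\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp})}. \tag{48}$$

(2) Without any restrictions on $f$, $\Delta g$ and $\mathcal{P}_b^{\perp} \cap \mathcal{P}_a$,

$$\|f_k - f\| \le \frac{\|\Delta g\| + \|Q_a Q_b f\| + (\|\Delta g\| + \|P_a f\|) \cdot \cos^k\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp})}{1 - \cos\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp})}, \tag{49}$$

$k = 1 \to \infty$.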

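Proof. (1) Clearly, since $f \in \mathcal{P}_b$, $g = P_a f = Af$, $g + \Delta g = A f_\infty$ and $\Delta f = A^{-1}(\Delta g)$. If $\Delta g \in \mathcal{P}_a$, $\|\Delta f\| \le \mu \cdot \|\Delta g\|$ where

$$\mu = \sup_{x \in \mathcal{P}_a} \frac{\|A^{-1} x\|}{\|x\|}. \tag{50}$$

Let $x = Ay$. As we already know, under the assumption $\mathcal{P}_b^{\perp} \cap \mathcal{P}_a = \{\phi\}$, $x \in \mathcal{P}_a$ iff $y \in \mathcal{P}_b$ and we can therefore write

$$\mu = \sup_{y \in \mathcal{P}_b} \frac{\|y\|}{\|y - Q_a P_b y\|} = \sup_{y \in \mathcal{P}_b} \frac{\|y\|}{\|P_a P_b y\|} = \frac{1}{\sin\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp})}, \tag{51}$$

establishing Eq. (48), Q.E.D.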

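(2) Let $g_1 = Af$. Then, $P_a g_1 = P_a f = g$ and

$$f_k - f = A^{-1}(g + \Delta g - g_1) - \sum_{r=k}^{\infty} (Q_a P_b)^r (g + \Delta g).$$

Or, since $g - g_1 = -Q_a g_1 = -Q_a (1 - Q_a P_b) f = -Q_a Q_b f$,

$$f_k - f = A^{-1}(\Delta g - Q_a Q_b f) - (Q_a P_b)^k A^{-1}(P_a f + \Delta g). \tag{52}$$

Combining Eq. (44) with $\|Q_a P_b\| = \cos\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp})$ we immediately derive Eq. (49) by norming both sides of Eq. (52), Q.E.D.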

Comment. The second and third terms in the numerator of (49) account, respectively, for aliasing (which invalidates the assumption $f \in \mathcal{P}_b$) and truncation error. Also observe that under the conditions prevailing in (1), the bound in Eq. (48) is sharper than that obtained by setting $Q_b f = \phi$ and letting $k \to \infty$ in Eq. (49).
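The bound of Eq. (49) can be compared with the actual error of the noisy recursion in a small example. The sketch below reuses an assumed pair of subspaces in $\mathbb{R}^4$ and an arbitrary perturbation `dg`; none of these numbers come from the report.

```python
import numpy as np

def projector(basis):
    """Orthogonal projector onto the column span of `basis`."""
    q, _ = np.linalg.qr(np.asarray(basis, dtype=float))
    return q @ q.T

norm = np.linalg.norm
P_a = projector([[1, 0], [0, 1], [0, 0], [0, 0]])
P_b = projector([[1, 0], [0, 1], [1, 0], [0, 2]])
Q_a, Q_b = np.eye(4) - P_a, np.eye(4) - P_b
M = Q_a @ P_b
cos_psi = norm(M, 2)                       # cos psi(P_b, complement of P_a)

f = np.array([1.0, 1.0, 1.0, 2.0])         # f lies in P_b, so Q_a Q_b f vanishes
g = P_a @ f
dg = np.array([0.01, -0.02, 0.03, 0.01])   # hypothetical noise of arbitrary orientation

fk = g + dg                                # f_1 = g + dg
holds = []
for k in range(1, 41):
    rhs = (norm(dg) + norm(Q_a @ Q_b @ f)
           + (norm(dg) + norm(P_a @ f)) * cos_psi**k) / (1.0 - cos_psi)
    holds.append(norm(fk - f) <= rhs)      # Eq. (49) at step k
    fk = (g + dg) + M @ fk                 # next iterate of Eq. (36)
bound_ok = all(holds)
```

The bound is far from tight here, which is consistent with the remark that Eq. (48) is sharper when its extra hypotheses hold.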

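In part ($a_3$) of theorem 1 it has been shown that the recursion Eq. (36) converges to $f$ for every $g$ in the range of the operator $A = 1 - Q_a P_b$, even if $\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp}) = 0$. The proof hinges on the uniqueness criterion

$$\mathcal{P}_b \cap \mathcal{P}_a^{\perp} = \{\phi\} \tag{53}$$

of ($a_1$). As expected, if $g \notin R(A)$, $\lim \|f_k\| = \infty$.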

Corollary 2. Assume that Eq. (53) is valid and the vector $g$ is not in the range of $A$. Let $\{f_k\}$ denote the sequence of approximations generated by Eq. (36) under the initialization $f_1 = g$. Then,

$$\lim_{k \to \infty} \|f_k\| = \infty. \tag{54}$$

Proof. If Eq. (54) doesn't hold, $\liminf \|f_k\| < \infty$ and there exists a subsequence $\{f_{k'}\}$ which is bounded in norm. Since Hilbert space is weakly compact one can extract from this subsequence another subsequence $\{f_{k''}\}$ which converges weakly to some element $f_0 \in \mathcal{H}$; i.e., $f_{k''} \rightharpoonup f_0$. (Here $f_{k''} \rightharpoonup f_0$ means $\lim (f_{k''},x) = (f_0,x)$ for every $x \in \mathcal{H}$.) Now from Eq. (36),

$$(1 - Q_a P_b) f_k = \sum_{r=0}^{k-1} (1 - Q_a P_b)(Q_a P_b)^r g = g - (Q_a P_b)^k g \tag{55}$$

and using Eq. (53) and von Neumann's theorem on alternating projections,

$$\lim_{k \to \infty} (1 - Q_a P_b) f_k = g. \tag{56}$$

However, because $1 - Q_a P_b$ is a bounded operator, $(1 - Q_a P_b) f_{k''} \rightharpoonup (1 - Q_a P_b) f_0$. But according to Eq. (56), $(1 - Q_a P_b) f_{k''} \to g$, and since the weak and strong limits of a sequence must coincide, $(1 - Q_a P_b) f_0 = g$. This means that $g \in R(A)$, a contradiction, Q.E.D.

To pin down the source of the blow-up in Eq. (54), suppose $A f_0 = g_0$ and Eq. (36) is initialized with $f_1 = g_0 + \Delta g$. Then, for $k \ge 1$,

$$f_k = f_0 - (Q_a P_b)^k f_0 + \delta_{k-1} \tag{57}$$

where

$$\delta_{k-1} = \sum_{r=0}^{k-1} (Q_a P_b)^r \cdot \Delta g. \tag{58}$$

If $g_0 + \Delta g \notin R(A)$, corollary 2 yields $\lim \|\delta_{k-1}\| = \infty$ since $\lim (Q_a P_b)^k f_0 = \phi$. Now in the error expression

$$e_k = f_0 - f_k = (Q_a P_b)^k f_0 - \delta_{k-1} \tag{59}$$

the norm of the first term is monotonically decreasing and it is evident that the algorithm should be terminated after a certain "optimal" number of steps. Unfortunately, the exact number depends both on the particular $f_0$ under consideration and on the growth rate of $\|\delta_{k-1}\|$ versus the decay rate of $\|(Q_a P_b)^k f_0\|$. Although the situation is not completely desperate, it is definitely unsatisfactory, but an in-depth discussion must be reserved for a future publication.
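The competition between the decaying term and the growing term in Eq. (59) can be illustrated in the plane. This is only a finite-dimensional analogue, assuming a small but nonzero angle `eps` between $\mathcal{P}_b$ and $\mathcal{P}_a^{\perp}$: in finite dimensions $g_0 + \Delta g$ always lies in $R(A)$, so the error saturates near $\|\Delta g\|/\sin\psi$ instead of diverging. All names and numbers are illustrative assumptions.

```python
import numpy as np

eps, c, delta = 0.05, 1.0, 0.01                 # assumed angle, signal size, noise size
b_dir = np.array([np.sin(eps), np.cos(eps)])    # unit vector spanning P_b
P_a = np.diag([1.0, 0.0])                       # P_a keeps the first coordinate
P_b = np.outer(b_dir, b_dir)
Q_a = np.eye(2) - P_a

f0 = c * b_dir                                  # true vector, lies in P_b
g_noisy = P_a @ f0 + np.array([delta, 0.0])     # g_0 + dg with dg in P_a

fk = g_noisy.copy()                             # f_1 = g_0 + dg
errors = []
for _ in range(5000):
    errors.append(np.linalg.norm(f0 - fk))      # ||e_k|| of Eq. (59)
    fk = g_noisy + Q_a @ (P_b @ fk)             # recursion of Eq. (36)
errors = np.array(errors)
best_k = int(np.argmin(errors)) + 1             # 1-based index of the best iterate
```

The error norm first falls to roughly `delta` at an interior iteration `best_k` and then climbs back toward `delta / sin(eps)`, which is the kind of behavior that motivates early termination.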

Note that $\lim \|\delta_{k-1}\| = \infty$ for $\Delta g \notin R(A)$ irrespective of the size of $\|\Delta g\|$.

C.  Discussion and an Example

It appears that in the majority of applications the modeling of image restoration problems is usually carried out with the aid of simplifying physical idealizations which invariably endow the operator $P_a P_b$ with a set of nonzero eigenvalues possessing 0 as a limit point. All these problems are improperly posed.

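Theorem 2. Let 0 be a limit point of the set of nonzero eigenvalues of the operator $P_a P_b$. Then, $\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp}) = 0$.

Proof. By hypothesis, there exists a sequence of nonzero scalars $\lambda_i$ with limit 0 and a sequence of vectors $x_i$ such that

$$P_a P_b x_i = \lambda_i x_i, \quad i = 1 \to \infty. \tag{60}$$

Clearly, $\lambda_i \neq 0$ implies $x_i \in \mathcal{P}_a$. Thus, $x_i = P_a x_i$ and the restriction of $P_a P_b$ to $\mathcal{P}_a$ coincides with the (positive) self-adjoint operator $P_a P_b P_a$. The sequence $\{x_i\}$ can therefore be chosen orthonormal and, as a consequence, the sequence $\{P_b x_i\}$ is orthogonal. Explicitly,

$$(P_b x_i, P_b x_j) = (x_i, P_a P_b x_j) = \lambda_j \delta_{ij}, \tag{60a}$$

$i, j = 1 \to \infty$. In particular,

$$\lambda_i = \|P_b x_i\|^2 > 0, \quad i = 1 \to \infty. \tag{61}$$

To calculate $\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp})$ employ the formula

$$\cos\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp}) = \sup_{g \in \mathcal{P}_b} \frac{\|Q_a g\|}{\|g\|} \tag{62}$$

given in Eq. (24) and substitute $g = P_b x_i$. With the aid of Eqs. (60) and (61),

$$\frac{\|Q_a P_b x_i\|^2}{\|P_b x_i\|^2} = \frac{\|P_b x_i\|^2 - \|P_a P_b x_i\|^2}{\|P_b x_i\|^2} = 1 - \lambda_i \to 1,$$

and it follows that the sup on the right-hand side of Eq. (62) equals 1; i.e., $\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp}) = 0$, Q.E.D.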

Comment 1. Norming both sides of Eq. (60) it is seen that $1 \ge \lambda_i$ with equality iff $x_i \in \mathcal{P}_b \cap \mathcal{P}_a$. Hence, $\mathcal{P}_b \cap \mathcal{P}_a = \{\phi\}$ implies $1 > \lambda_i > 0$, $i = 1 \to \infty$. Moreover, if $\mathcal{P}_b \cap \mathcal{P}_a^{\perp} = \{\phi\}$ the completeness of the $x_i$'s in $\mathcal{P}_a$ also entails that of the $P_b x_i$'s in $\mathcal{P}_b$. Taking the orthogonality of the sequence $\{P_b x_i\}$ into account, it suffices for the proof to show that $h = \phi$ is the only vector in $\mathcal{P}_b$ satisfying $(h, P_b x_i) = 0$ for all $i$. For such an $h$, $(P_a h, x_i) = 0$, $i = 1 \to \infty$, and from the completeness of the $x_i$'s in $\mathcal{P}_a$, $P_a h = \phi$. Thus, $h \in \mathcal{P}_b \cap \mathcal{P}_a^{\perp}$, which implies $h = \phi$, Q.E.D. The reader has now undoubtedly perceived that under the assumptions




the $x_i$'s enjoy all the remarkable properties of the prolate spheroidal wavefunctions (Ref. 4). These properties simply reflect the structure of $P_a P_b$ as the composition of two orthogonal projection operators.

Example. Consider the Hilbert space $\mathcal{H}$ of all functions $f(t)$ square-integrable over $-\infty < t < \infty$ equipped with the inner product

$$(f,g) = \int_{-\infty}^{\infty} f(t)\, g^*(t)\, dt. \tag{65}$$

(The function $g^*(t)$ is the complex conjugate of $g(t)$.) To indicate that $f(t)$ and $F(\omega)$ are Fourier transform pairs we write $f(t) \leftrightarrow F(\omega)$. Of course,

$$F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt \tag{66}$$

is the $L^2$-Fourier transform of $f(t)$.

For $a > 0$ prescribed and finite, let $\mathcal{P}_a$ denote the subset of $\mathcal{H}$ composed of all functions which vanish (a.e.) in $|t| > a$. Similarly, for a prescribed finite $b > 0$ let $\mathcal{P}_b$ contain all functions whose Fourier transforms vanish (a.e.) in $|\omega| > b$. It is easily shown that 1) $\mathcal{P}_a$ and $\mathcal{P}_b$ are CLMs, 2) that $\mathcal{P}_a^{\perp}$ is the set of all functions vanishing (a.e.) in $|t| < a$ and 3) that $\mathcal{P}_b^{\perp}$ is the set of all functions whose Fourier transforms vanish (a.e.) in $|\omega| < b$. Furthermore, letting

$$g_a(t) = \begin{cases} 1, & |t| \le a \\ 0, & |t| > a \end{cases} \tag{67}$$

and

$$g_b(\omega) = \begin{cases} 1, & |\omega| \le b \\ 0, & |\omega| > b \end{cases} \tag{68}$$

it is evident that if $f(t) \leftrightarrow F(\omega)$, $P_a f = g_a(t) f(t)$ and $P_b f \leftrightarrow g_b(\omega) F(\omega)$. Explicitly,

$$P_b f = \frac{1}{2\pi} \int_{-b}^{b} F(\omega)\, e^{j\omega t}\, d\omega = \int_{-\infty}^{\infty} f(\tau)\, \frac{\sin b(t-\tau)}{\pi (t-\tau)}\, d\tau. \tag{69}$$

Thus, $\mathcal{P}_b$ is the set of all functions bandlimited to the (radian) frequency interval $|\omega| \le b$ and $\mathcal{P}_a$ the set of all functions time-limited to the interval $|t| \le a$. The corresponding image restoration problem can be stated as follows: A function $f(t)$ is known to be bandlimited to the interval $-b \le \omega \le b$ but only its segment over $-a \le t \le a$ is available. Reconstruct $f(t)$ for $|t| > a$. In our formulation, $f \in \mathcal{P}_b$ and the projection $g = P_a f = g_a(t) f(t)$ onto $\mathcal{P}_a$ is given. By theorem 1, the reconstruction of $f(t)$ can be accomplished with the aid of the recursion Eq. (36) provided $\mathcal{P}_b \cap \mathcal{P}_a^{\perp} = \{\phi\}$. But this is obvious since any $f(t) \in \mathcal{P}_b$ is automatically an entire function of the complex variable $t$ and can vanish in $|t| < a$ iff it is the zero function. (The same reasoning also yields $\mathcal{P}_b \cap \mathcal{P}_a = \{\phi\} = \mathcal{P}_b^{\perp} \cap \mathcal{P}_a$.) Hence, translating Eq. (36) into words, it is seen that the sequence $\{f_k(t)\}$ generated by the 2-rule program given below converges to $f(t)$ in norm:

($R_1$) Set $f_1(t) = g_a(t) f(t)$, the initial datum.

($R_2$) For $k \ge 1$ bandlimit $f_k(t)$ to the interval $|\omega| \le b$ (use a low-pass filter or FFT) and correct the resultant waveform over $|t| \le a$ to agree with $f_1(t)$. Call the final function $f_{k+1}(t)$ and repeat ($R_2$).
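Rules ($R_1$) and ($R_2$) can be sketched in discrete form. The code below is only a sampled, periodic analogue of the continuous problem: it assumes a length-64 real sequence whose DFT is confined to bins $|k| \le 2$ and whose first and last four samples are missing. The function names `bandlimit` and `extrapolate` and the test signal are illustrative assumptions.

```python
import numpy as np

def bandlimit(x, K):
    """Zero every DFT bin with |index| > K (a discrete stand-in for P_b)."""
    X = np.fft.fft(x)
    bins = np.fft.fftfreq(len(x), d=1.0 / len(x))   # signed integer bin indices
    X[np.abs(bins) > K] = 0.0
    return np.fft.ifft(X).real

def extrapolate(segment, known, K, iterations):
    """Apply rule (R2) repeatedly, starting from the datum of rule (R1)."""
    f = segment.copy()                  # (R1): f_1 is the known segment, zero elsewhere
    for _ in range(iterations):
        f = bandlimit(f, K)             # (R2): bandlimit ...
        f[known] = segment[known]       # ... then restore the known samples
    return f

N, K = 64, 2
n = np.arange(N)
f_true = 1.0 + np.cos(2 * np.pi * n / N) + 0.5 * np.sin(4 * np.pi * n / N)
known = np.zeros(N, dtype=bool)
known[4:60] = True                      # samples 4..59 are observed
segment = np.where(known, f_true, 0.0)
f_rec = extrapolate(segment, known, K, iterations=300)
max_error = np.max(np.abs(f_rec - f_true))
```

In this discrete setting the angle is strictly positive, so convergence is geometric; the continuous problem treated in the text is improperly posed and behaves far less favorably.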

This 2-step procedure, which emerges as a special case of theorem 1, is precisely the new algorithm published recently by Papoulis, who based the proof of convergence on the detailed properties of the prolate spheroidal wavefunctions. The latter are eigenfunctions of the operator $P_a P_b$. In fact, for $x_i(t) \in \mathcal{P}_a$,

$$P_b x_i = \int_{-a}^{a} \frac{\sin b(t-\tau)}{\pi (t-\tau)}\, x_i(\tau)\, d\tau \tag{70}$$

and $P_a P_b x_i = \lambda_i x_i$ is equivalent to

$$\int_{-a}^{a} \frac{\sin b(t-\tau)}{\pi (t-\tau)}\, x_i(\tau)\, d\tau = \lambda_i x_i(t), \quad -a \le t \le a, \tag{71}$$

a familiar integral equation (Ref. 4). As is well known (Ref. 4), $1 > \lambda_i > 0$ for $i = 1 \to \infty$, $\lambda_i \to 0$ and the $x_i$'s are complete in $\mathcal{P}_a$. Since $\mathcal{P}_b \cap \mathcal{P}_a^{\perp} = \{\phi\}$, the $P_b x_i$'s are complete in $\mathcal{P}_b$. In addition, invoking theorem 2, $\psi(\mathcal{P}_b,\mathcal{P}_a^{\perp}) = 0$, which means that the reconstruction problem is improperly posed, a conclusion also reached by Viano in a different way.
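The eigenvalues of the integral equation (71) can be approximated by quadrature. The sketch below assumes $a = 1$, $b = 4$ and a 40-point Gauss-Legendre rule; these values, and the symmetrization by the square roots of the weights, are choices made for illustration and are not taken from the report.

```python
import numpy as np

a, b, n = 1.0, 4.0, 40                              # assumed interval, bandwidth, node count
nodes, weights = np.polynomial.legendre.leggauss(n)
t = a * nodes                                       # quadrature nodes on [-a, a]
w = a * weights                                     # matching weights (they sum to 2a)

diff = t[:, None] - t[None, :]
# np.sinc(x) = sin(pi x)/(pi x), hence (b/pi) * sinc(b*diff/pi) = sin(b*diff)/(pi*diff),
# with the correct limiting value b/pi on the diagonal.
kernel = (b / np.pi) * np.sinc(b * diff / np.pi)

# Symmetrized discretization: same eigenvalues as the weighted kernel, but symmetric.
sym = np.sqrt(w)[:, None] * kernel * np.sqrt(w)[None, :]
lam = np.sort(np.linalg.eigvalsh(sym))[::-1]        # eigenvalues in decreasing order

psi_ab = np.arccos(np.sqrt(lam[0]))                 # angle between P_b and P_a from the largest eigenvalue
```

The leading eigenvalue is close to but below one, the eigenvalues sum to $2ab/\pi$, and they fall off rapidly toward zero, which is the accumulation at 0 that theorem 2 requires.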


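As is easily shown, it also converges to $f(t)$ uniformly over $-\infty < t < \infty$ (Ref. 5).

But interestingly enough, by exploiting the completeness of the $P_b x_i$'s in $\mathcal{P}_b$ and using Eq. (60), the formula

$$\cos\psi(\mathcal{P}_b,\mathcal{P}_a) = \sup_{g \in \mathcal{P}_b} \frac{\|P_a g\|}{\|g\|} \tag{71a}$$

yields $\psi(\mathcal{P}_b,\mathcal{P}_a) = \cos^{-1}\sqrt{\lambda_1}$. Since $1 > \lambda_1 > 0$ we have $0 < \psi(\mathcal{P}_b,\mathcal{P}_a) < \pi/2$, a result due to Landau and Pollak.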
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AN  APPROACH  TO  THE  PATTERN  CLASSIFICATION  AND  ESTIMATION  PROBLEM 
USING  THE  CONCEPT  OF  STOCHASTIC  DIFFERENTIAL  EQUATIONS 

L.  Kurz 

Grenander$^{1}$ has considered one approach to the problem of employing statistical methods to pattern classification. His development, which makes use of his earlier work in statistical inference,$^{2}$ can be divided into three somewhat arbitrary sections. First, one selects a method for describing the pure or undisturbed images so that a mathematical structure is available. Second, the deformation or noise disturbances of the pure images are characterized. Third, a selection scheme or decision rule for distinguishing between images is chosen. These three sub-problems are analogous to the problems of alphabet description, channel characterization and receiver structure selection which arise in communication systems. The description of the pure images and the means for their generation as well as certain transformation properties are accomplished by setting up an appropriate algebraic structure. The probabilistic deformation grammar is introduced so that appropriate measures can be used to describe the disturbances and provide a background in which the methods of statistical inference can be applied to pattern classification problems. Because the goals of the development are broad, the choice of a decision rule is arrived at from an intuitive approach of distance and maximum likelihood. As such, the well-known Radon-Nikodym derivative (RND) or generalized likelihood ratio is introduced and serves as the basis of the decision rules.

The RND may be used successfully in classification and maximum likelihood estimation (MLE) problems. In particular, a method is given for finding the appropriate RND's of a one-dimensional Gaussian process which is equivalent to the Wiener process or one of its variants: the process is embedded in an n-dimensional Markov process which is the solution of a vector stochastic differential equation. Through this embedding procedure, one makes use of the structure of vector Markov processes, their equivalence conditions and the corresponding RND. Though the procedure applies to classification and estimation problems, specific examples will be given for MLE only. The extension to classification problems is routine. Details are given for a system transfer function described by an all-pole model. Other models may be treated in a similar manner.
and $a(t,\xi(t))$ and $b_k(t,\xi(t))$ are $n$-dimensional vector coefficients. If the initial condition is specified, the solution to Eq. (1) is the process with sample functions $\xi(t)$ which are given by

$$\xi(t) = \xi(0) + \int_0^t a(s,\xi(s))\, ds + \sum_{k=1}^{\ell} \int_0^t b_k(s,\xi(s))\, dW_k(s), \quad t \ge 0. \tag{2}$$

The questions of existence and uniqueness of solutions to Eq. (1) are answered by the following theorems.

Theorem 1: Suppose that the following conditions are fulfilled:

1.  $\xi(0)$ does not depend on the processes $W_1(t), \ldots, W_\ell(t)$,

2.  For every $c > 0$ there exists an $L_c$ such that for $|x| \le c$, $|y| \le c$, the inequality

$$|a(t,x) - a(t,y)|^2 + \sum_{k=1}^{\ell} |b_k(t,x) - b_k(t,y)|^2 \le L_c\, |x - y|^2 \tag{3}$$

is fulfilled.

3.  There exists a $K$ such that

$$|a(t,x)|^2 + \sum_{k=1}^{\ell} |b_k(t,x)|^2 \le K(1 + |x|^2) \tag{4}$$

4.  $a(t,x), b_1(t,x), \ldots, b_\ell(t,x)$ are continuous with respect to the set of variables $x \in R^n$, $t \in [0,T]$,

then Eq. (1) has a continuous and bounded solution with probability one. Further, the solution is unique in the sense that any two such solutions coincide with probability one at all points.

Theorem  2:  If  the  conditions  of  Theorem  1 are  satisfied,  the  solution  to  Eq.  (1) 
is  a Markov  process. 
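Sample functions of the integral equation (2) can be approximated on a time grid by the Euler-Maruyama scheme, which replaces each integral by a one-step increment. The sketch below is an illustration only: the scheme, the function name `euler_maruyama`, and the scalar test equation $d\xi = -\xi\,dt + 0.5\,dW$ (whose coefficients satisfy the Lipschitz and growth conditions of Theorem 1) are assumptions and are not taken from the report.

```python
import numpy as np

def euler_maruyama(a, b, xi0, T, steps, rng):
    """Approximate xi(t) of Eq. (2) on a uniform grid.

    a(t, x) returns the n-vector drift; b(t, x) returns an (n, l) array
    whose columns are the coefficients b_k(t, x).
    """
    xi0 = np.asarray(xi0, dtype=float)
    dt = T / steps
    path = np.empty((steps + 1, xi0.size))
    path[0] = xi0
    for i in range(steps):
        t = i * dt
        B = np.asarray(b(t, path[i]), dtype=float)
        dW = rng.normal(0.0, np.sqrt(dt), size=B.shape[1])   # independent Wiener increments
        path[i + 1] = path[i] + np.asarray(a(t, path[i])) * dt + B @ dW
    return path

rng = np.random.default_rng(0)
drift = lambda t, x: -x
noisy = euler_maruyama(drift, lambda t, x: np.array([[0.5]]), [1.0], 1.0, 1000, rng)
quiet = euler_maruyama(drift, lambda t, x: np.zeros((1, 1)), [1.0], 1.0, 1000, rng)
```

With the diffusion coefficient set to zero the scheme reduces to the ordinary Euler method, so `quiet` ends close to $e^{-1}$; `noisy` is one sample path driven by a single Wiener process.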

Suppose that $\xi_1(t)$ and $\xi_2(t)$ are solutions of the stochastic differential equations

$$\xi_i(t) = \xi_i(0) + \int_0^t a_i(s,\xi_i(s))\, ds + \sum_{k=1}^{\ell} \int_0^t b_k(s,\xi_i(s))\, dW_k(s), \quad i = 1, 2 \tag{5}$$

the coefficients of which satisfy the conditions of Theorem 1. The following theorem establishes sufficient conditions for the equivalence of the measures $\mu_1$ and $\mu_2$ corresponding to the processes $\xi_1(t)$ and $\xi_2(t)$, and also a general expression for the associated RND.

Theorem 3: Suppose that the following conditions are satisfied:

1.  There exist functions $\alpha_1(t,x), \ldots, \alpha_\ell(t,x)$ such that

$$a_2(t,x) - a_1(t,x) = \sum_{k=1}^{\ell} \alpha_k(t,x)\, b_k(t,x) \tag{6}$$

2.  The functions $\alpha_1(t,x), \ldots, \alpha_\ell(t,x)$ and $\alpha(t,x)$ given by

$$\alpha(t,x) = -\frac{1}{2} \sum_{k=1}^{\ell} \alpha_k^2(t,x) \tag{7}$$

are continuous over the set of variables,

3.  The distributions of $\xi_1(0)$ and $\xi_2(0)$ are absolutely continuous with respect to one another and the density of the distribution of $\xi_2(0)$ with respect to the distribution of $\xi_1(0)$ is $p_0(x)$,

then the measures $\mu_1$ and $\mu_2$ are equivalent and


Equation (8) is a general expression for the log of the RND of two equivalent n-dimensional Markov processes. It will be necessary to use this equation to derive an expression for the RND corresponding to two equivalent one-dimensional processes which are components of $\xi_1(t)$ and $\xi_2(t)$. This is facilitated by the following theorem.

Theorem 4: Suppose that $\xi_1(t)$ and $\xi_2(t)$ are processes defined on $[0,T]$ and taking values in $R^n$. Denote by $\Phi_n[0,T]$ the set of all functions $x(t)$ that are defined on $[0,T]$ and that take values in $R^n$. Denote by $F_n[0,T]$ the minimal $\sigma$-algebra of subsets of $\Phi_n[0,T]$ that contains all cylindrical sets. Let $\gamma$ be an $F_n[0,T] \times F_{n'}[0,T]$ measurable mapping of $\Phi_n[0,T]$ into $\Phi_{n'}[0,T]$. Suppose that $\eta_1(t) = \gamma\xi_1(t)$, $\eta_2(t) = \gamma\xi_2(t)$, that the $\mu_i$ are measures corresponding to the processes $\xi_i(t)$ on $F_n[0,T]$ and that the $\nu_i$ are measures corresponding to the processes $\eta_i(t)$ on $F_{n'}[0,T]$; then if $\mu_2$ is equivalent to $\mu_1$, it follows that $\nu_2$ is equivalent to $\nu_1$ and

$$\frac{d\nu_2}{d\nu_1}(\eta_1(t)) = E\left[\frac{d\mu_2}{d\mu_1}(\xi_1(t)) \,\Big|\, \eta_1(t),\ t \in [0,T]\right] \tag{9}$$

B.  Solution  of  Estimation  Problems 

The general vector Markov process is specified by Eq. (2), which in turn is completely determined by the coefficients $a(t,\xi(t))$, $b_k(t,\xi(t))$ and the initial condition $\xi(0)$. For Gaussian processes certain significant simplifications occur. If $\xi(t)$ is Gaussian, then the components of $a(t,\xi(t))$ can only be linear functions of the components of $\xi(t)$
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•n  ' ' 


which  satisfies  the  vector  equation  having  coefficients 


W^")(t) 
-n  ' ' 


a(t,  m)  = 


; b(t.  |(t))  = 


•^n  ' 
~n 


and  the  initial  condition 


where the $\eta_i$'s are independent $N(0,1)$ random variables and the $(n+1)$st component of Eq. (17) is the original one-dimensional process. It is important to note that although the formal differentiation of Eq. (10) is valid, the correct interpretation of the differential of a Markov process requires the use of Ito's formula.$^{4}$ Equations (10) through (19) characterize the free Wiener process {W^(t)}. By simply changing the initial conditions associated with these equations, we obtain the equations for the integrated Wiener process {W^(t)}. The desired equations have the coefficients
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a(t,  |(t))  = 


0 

W*"^(t) 

n 

i ; b(t.  I(t))  = 

( 

1 

* I 


n ' 


(20) 


and the initial condition $\xi(0) = 0$ with probability one. Equations (10) through (20) are of a simple form and describe processes that are generated by a "driving process" which is a one-dimensional Wiener process. By choosing coefficients different from those of Eq. (18), we obtain different processes.


C.  Likelihood Ratios (RNDs) For Estimation of Parameters of Lumped Systems Without Numerator Dynamics

Consider the process $\{x(t)\}$ having power spectrum of the form

$$S(\omega) = \frac{1}{(\omega^2 + s_1^2)(\omega^2 + s_2^2) \cdots (\omega^2 + s_n^2)} \qquad (21)$$


It is known that the process $\{x(t)\}$ is (n-1)-times differentiable and equivalent to $\tilde{W}_{n-1}(t)$. Although $\{x(t)\}$ is not a one-dimensional Markov process, the n-dimensional process consisting of x(t) and its (n-1) derivatives is Markov. To find the appropriate vector equation, consider the vector process


$$\xi_2(t) = \begin{bmatrix} x^{(n-1)}(t) \\ x^{(n-2)}(t) \\ \vdots \\ x(t) \end{bmatrix} \qquad (22)$$


The coefficients $a_2(t, \xi_2(t))$ and $b_2(t, \xi_2(t))$ can be found from the infinitesimal properties of the process using the formulas


a2(t,  C2<t))  = 


a2i(t, 


L"2n<^' 


(23) 


bi  i(t, 


= lim  E 
h|o 


ii^{t+h)  - 


l,(t) 


(24) 


The conditional expectations of Eqs. (22) and (23) can be evaluated because the variables $x(t), x^{(1)}(t), \ldots, x^{(n-1)}(t)$ are Gaussian and their n-dimensional covariance matrix can be determined using Equation (21). Alternatively, these coefficients may be determined from the defining equations for $x^{(i)}(t)$. For all i such that $0 \le i \le n-2$, the sample functions $x^{(i)}(t)$ are differentiable a.s. so that the defining equations yield the coefficients


$$a_{2j} = x^{(n-j+1)}(t) = \frac{d x^{(n-j)}(t)}{dt} \quad \text{and} \quad b_{2j} = 0, \qquad 1 < j \le n \qquad (25)$$


To determine $a_{21}$ and $b_{21}$, we use the following recursion relation


with  the  end  conditions 

.(1), 


1 < k < n-2 


and 


yn- j(t)  = x(t)  + ^ (t) 


dW(t)  = yp(t)dt  = Sjyj(t)dt  + dy^(t) 


(26) 


(27) 


(28) 


where  W(t)  is  the  Wiener  process.  Equation  (28)  is  interpreted  in  the  Ito  sense.  The 
correct  results  can  also  be  obtained  through  the  formal  procedure  of  rewriting  Eq.  (26) 
as 


$$n(t) = y_0(t) = s_1 y_1(t) + \frac{d y_1(t)}{dt} \qquad (29)$$

where n(t) is the white noise process. Using these equations, we find that

$$a_2(t, \xi_2(t)) = \begin{bmatrix} m^{(n)}(t) - \sum_{i=1}^{n} \left( x^{(n-i)}(t) - m^{(n-i)}(t) \right) S_n(i) \\ x^{(n-1)}(t) \\ x^{(n-2)}(t) \\ \vdots \\ x^{(1)}(t) \end{bmatrix}; \quad b_2(t, \xi_2(t)) = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \qquad (30)$$

where m(t) is the mean function of the observed process x(t) and

$S_n(j)$ = sum of all different products of the parameters $s_1, s_2, \ldots, s_n$ taken j at a time.

For example, if n = 3, $S_3(1) = s_1 + s_2 + s_3$, $S_3(2) = s_1 s_2 + s_1 s_3 + s_2 s_3$, and $S_3(3) = s_1 s_2 s_3$.
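The sums $S_n(j)$ are the elementary symmetric functions of the pole parameters, so they can be computed by direct enumeration. The sketch below is only an illustration; the function name `symmetric_sum` and the numerical values are hypothetical.

```python
from itertools import combinations
from math import prod


def symmetric_sum(s, j):
    """Return S_n(j): the sum of all products of the entries of s taken j at a time."""
    if j == 0:
        return 1  # convention S_n(0) = 1
    return sum(prod(c) for c in combinations(s, j))


# Hypothetical three-pole case with s_1 = 1, s_2 = 2, s_3 = 3.
poles = [1, 2, 3]
sums = [symmetric_sum(poles, j) for j in range(4)]
```

For three poles the list holds $S_3(0)$ through $S_3(3)$, which are also the coefficients of the polynomial $(D + s_1)(D + s_2)(D + s_3)$ in descending powers of D.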

Now that the coefficients of Eq. (30) have been derived, the next task is to choose an appropriate n-dimensional reference process, say $\xi_1(t)$, which can be used to find the desired likelihood equation. Motivated by the facts that the observed process x(t) is equivalent to $\tilde{W}_{n-1}(t)$ and that the b-coefficient matrix of $\xi_1(t)$ must be equal to the b-matrix of Eq. (30), it is natural to select $\xi_1(t)$ to be

$$\xi_1(t) = \begin{bmatrix} \tilde{W}_{n-1}^{(n-1)}(t) \\ \tilde{W}_{n-1}^{(n-2)}(t) \\ \vdots \\ \tilde{W}_{n-1}(t) \end{bmatrix}$$

This process was discussed in Section B; its coefficient matrices are given by Equation (18). Using the above equations, it is a straightforward matter to verify the conditions of Theorem 3 and show that the measures $P_1$ and $P_2$ of the two n-dimensional processes $\xi_1(t)$ and $\xi_2(t)$, respectively, are equivalent. If the observation interval is [0, T], then the logarithm of the generalized likelihood ratio is given by Eq. (8) which reduces to

$$\log \frac{dP_2}{dP_1}(\xi_1(t)) = \log p_0(\xi_1(0)) + \int_0^T \alpha(t, \xi_1(t))\, dt + \int_0^T \alpha_1(t, \xi_1(t))\, dW(t) \qquad (32)$$

where $\alpha_1(t, \xi_1(t))$ satisfies the equation


m^"^t)-  E [W<":'^t) -m<"‘^\t)]S  (i)~] 
i=l 


Oj(t,  4j(t))  = a2(t.  4i(t))  -aj(t,  4j(t))  = 


w^":^^t) 

-n-l  ' ' 


'yil’iw 
and 


Q (t,  ^(t))  = - E S^(i) 

i=  1 


a(t.  |j(t))  = |j(t)) 


(34) 


(35) 


To obtain a useful form of Eq. (32), which can easily be implemented by a practical estimator, it is necessary to evaluate the last term which is a stochastic or Ito integral. Although this is, in general, a difficult task, it can be readily accomplished in the present case. Because Eq. (34) contains terms which depend linearly on the derivatives of $\xi_1(t)$, the integral can be evaluated by using Ito's formula and the smoothness properties of the individual terms; the result is

X 

G[Wn.i(t)]  H / Qj(t,  ej(t))dW(t)  = m<")(T)W^’;^;^>(T)-m<"\o)W^';;‘>(0)  + yS^(1) 

^ Q2(n-l)jT)  .y^f2(n-l)^Q)^Tl  . I s (j)  { W<":  ^ ^T)  fw^^T^^T) 

j^n-1  — j n LT““^ 

-m^"'j^0)J+  E S^(j)  / 
j—  2 0 


w ^^t)dt  - f w^":^\t) 

-n-l  -n-1  -i,  -n-i 


E S (j)m^"‘^^'j*(t) 
j-0  " 


dt 


(36) 


where 


S (0)  = 1 for  all  n 
n 

Equation (33) represents the likelihood ratio of the measures of the two n-dimensional processes $\xi_1(t)$ and $\xi_2(t)$. However, because we can only observe the output process x(t), we are really interested in obtaining the likelihood ratio of the one-dimensional processes, $P_x$ with respect to $P_{\tilde{W}_{n-1}}$. This is accomplished by using Theorem 4 which yields the result

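$$\frac{dP_x}{dP_{\tilde{W}_{n-1}}}(\tilde{W}_{n-1}(t)) = E\left\{ \frac{dP_2}{dP_1}(\xi_1(t)) \,\Big|\, \tilde{W}_{n-1}(t),\ 0 \le t \le T \right\} \qquad (37)$$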


To evaluate the conditional expectation of Eq. (37), we need only observe that if $\tilde{W}_{n-1}(t)$, $0 \le t \le T$, is known, then all the remaining components of $\xi_1(t)$, $0 \le t \le T$, can be determined by differentiation. In other words, when conditioned on $\tilde{W}_{n-1}(t)$, $0 \le t \le T$, $dP_2/dP_1$ is a constant, so that

$$\frac{dP_x}{dP_{\tilde{W}_{n-1}}}(\tilde{W}_{n-1}(t)) = \frac{dP_2}{dP_1}(\xi_1(t)) \qquad (38)$$


Finally, using the facts that $P_x \sim P_{\tilde{W}_{n-1}}$ and $P_2 \sim P_1$, where $\sim$ denotes equivalence, we can replace $\tilde{W}_{n-1}(t)$ by x(t) in the above equations to obtain the desired result

$$\Lambda = \log \frac{dP_x}{dP_{\tilde{W}_{n-1}}}(x(t)) = \log p_0(x^{(n-1)}(0), \ldots, x(0)) + \int_0^T \alpha(t, x^{(n-1)}(t), \ldots, x(t))\, dt + G[x(t)] \qquad (39)$$


where G[x(t)] is defined by Eq. (36), $\alpha(t, x^{(n-1)}(t), \ldots, x(t))$ is given by Eq. (36) and $p_0$, as in Eq. (38), is the ratio of the two n-dimensional Gaussian density functions.

D.  Example  of  Application:  MLE  of  Mean  Function  Parameter  for  a Two  Pole  Case 

It  can  be  shown  that  for  a two  pole  system,  Eq.  (39)  reduces  to 

A = jlog  4SjS2(Sj  + 82)^  + [1  - (Sj  + s^)]^  x^^\o)  + [l  - s jS2(s  j + s^)]  x^(0)  - 
- (s^  + S2)x<‘^T)  - SjS2(Sj  + S2)x^(T)  - 2s  ^ s^  [x(T)  x<  ^ ^T)  - x(0)  x<  ^ V)1  + 


+ (Sj+  S2)T  + 2[2{Sj-t  S2)m^^\o)  - f(0)]  x^  ^ ^(0)  + 4 s ^ S2(s  ^ + s^)  m(0)  x (0) 

T 

+ 2f(T)x^‘^(T) -2(Sj  + S2)[m^^^0)+SjS2m^(0)]  - f ^(t)  dt 

T T 

+ 2s,s,  r f(t)  x(t)  dt  + 2 / [(s  +sjf(t)-f^‘^(t)]x^^^t)dt 
- (SJ  + S2)  /^x^<^\t)dt-Sj  S2  /^x^(t)dt} 


(40) 


Consider the problem in which $s_1$ and $s_2$ are known and it is desired to estimate an unknown parameter, $\theta$, of the mean function. This corresponds to the situation in which there is enough information available to specify the pole locations in the model used for the noise process. The transmitted signal, or pure image, however, cannot be completely specified, but is of known form

$$m(t) = m(t; \theta) \qquad (41)$$
It follows that f(t) depends on $\theta$ and is given by

$$f(t) = f(t; \theta) = m^{(2)}(t; \theta) + (s_1 + s_2)\, m^{(1)}(t; \theta) + s_1 s_2\, m(t; \theta)$$

Maximization of $\Lambda$ with respect to $\theta$ results in the following likelihood equation
0 = { t2(s  j + S2)m^^\o;  0)  - £(0;  0)]  x^^\o)  + 2 s jS2(Sj  + S2)m(0;  0)x(O) 

2 T 

+ f(T;0)x<^V)-(Sj  + S2)[m  0)+ SjS2m^(0;  0)]  - i f^(t,  0)  dt 

T T 

+ S.S,  r £(t;  0)x(t)dt  + f [(s  + S,)f(t;  0)  - £^^\t;  0)]  x^^^t)  dt  I 
1 2^0  0 ^ ^ •’ 

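If $m(t; \theta)$ is of the form

$$m(t; \theta) = \theta g(t)$$

where g(t) is a known function of time, then

$$f(t; \theta) = \theta G(t)$$

where

$$G(t) = g^{(2)}(t) + (s_1 + s_2)\, g^{(1)}(t) + s_1 s_2\, g(t) \qquad (46)$$

Under these conditions, solution of Eq. (43) yields the following unbiased and efficient MLE for $\theta$

$$\hat{\theta} = \Big\{ [2(s_1 + s_2) g^{(1)}(0) - G(0)]\, x^{(1)}(0) + 2 s_1 s_2 (s_1 + s_2)\, g(0)\, x(0) + G(T)\, x^{(1)}(T) + s_1 s_2 \int_0^T G(t) x(t)\, dt - \int_0^T [G^{(1)}(t) - (s_1 + s_2) G(t)]\, x^{(1)}(t)\, dt \Big\} \times \Big\{ 2(s_1 + s_2)[(g^{(1)}(0))^2 + s_1 s_2\, g^2(0)] + \int_0^T G^2(t)\, dt \Big\}^{-1} \qquad (47)$$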

This  estimator  depends  on  the  observed  waveform  and  its  derivative  and  can  be  imple- 
mented. 
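One way such an estimator might be implemented on sampled data is sketched below. This is an illustrative sketch rather than the authors' implementation: it assumes the structure of Eq. (47) (boundary terms plus two integrals, divided by a data-independent normalizer), uses trapezoidal integration, and takes the derivative samples as given. The function names `trapz` and `mle_theta`, the choice g(t) = sin t, and all numerical values are hypothetical. The noise-free check uses the fact that, after one integration by parts, the numerator equals $\theta$ times the denominator when x(t) = $\theta$ g(t).

```python
import math


def trapz(values, dt):
    """Trapezoidal rule on equally spaced samples."""
    return dt * (sum(values) - 0.5 * (values[0] + values[-1]))


def mle_theta(x, x1, g, g1, G, G1, s1, s2, dt):
    """Evaluate the two-pole estimator from waveforms sampled on [0, T].

    x, x1 : samples of the observation and its first derivative
    g, g1 : samples of g(t) and its first derivative
    G, G1 : samples of G(t) = g'' + (s1 + s2) g' + s1 s2 g and its derivative
    """
    a = s1 + s2
    p = s1 * s2
    num = ((2 * a * g1[0] - G[0]) * x1[0]
           + 2 * p * a * g[0] * x[0]
           + G[-1] * x1[-1]
           + p * trapz([Gi * xi for Gi, xi in zip(G, x)], dt)
           - trapz([(G1i - a * Gi) * x1i for G1i, Gi, x1i in zip(G1, G, x1)], dt))
    den = 2 * a * (g1[0] ** 2 + p * g[0] ** 2) + trapz([Gi * Gi for Gi in G], dt)
    return num / den


# Noise-free check with the hypothetical choice g(t) = sin t, x(t) = theta * g(t).
s1, s2, theta, T, n = 1.0, 2.0, 2.5, 5.0, 5000
dt = T / n
t = [i * dt for i in range(n + 1)]
g = [math.sin(u) for u in t]
g1 = [math.cos(u) for u in t]
g2 = [-math.sin(u) for u in t]
g3 = [-math.cos(u) for u in t]
G = [g2[i] + (s1 + s2) * g1[i] + s1 * s2 * g[i] for i in range(n + 1)]
G1 = [g3[i] + (s1 + s2) * g2[i] + s1 * s2 * g1[i] for i in range(n + 1)]
x = [theta * v for v in g]
x1 = [theta * v for v in g1]
theta_hat = mle_theta(x, x1, g, g1, G, G1, s1, s2, dt)
```

In practice the derivative of the observed waveform would have to be measured or estimated, which is the main practical cost of the estimator.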
A ROBUSTIZED KIEFER-WOLFOWITZ PROCEDURE WITH APPLICATION TO
THRESHOLD  ESTIMATION  FOR  BLOCK  DATA  PROCESSING 

P.  Kersten  and  L.  Kurz 

The stochastic approximation procedures of both Robbins-Monro (Ref. 7) and Kiefer-Wolfowitz provide a class of recursive estimators which are intrinsically nonparametric and amenable to digital computation. The convergence properties of both of these procedures have been considered by many authors including Dvoretzky (Ref. 3), Burkholder (Ref. 2), and Sacks (Ref. 8). Dvoretzky considered the a.s. and mean-square convergence of these algorithms and Sacks concentrated on the asymptotic normality of these procedures.

Two modifications, which enhance the rate of convergence and the robustness of the Kiefer-Wolfowitz (KW) procedure, are the main thrust of this report. The first employs adaptive gain coefficients to ensure an asymptotically optimum rate of convergence and the second exploits nonlinear pre-processing of the data to immunize the procedure against Burst Noise. Since many real-world forms of noise have distribution functions which may be modelled as a mixture distribution $F(x) = (1 - \epsilon) G(x) + \epsilon H(x)$, where H(x) is a high variance CDF, the latter problem is of considerable interest in communication, radar, sonar, picture processing, etc.
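The mixture model can be illustrated with a short sampling sketch. The choices below are assumptions made only for the example: both G and H are taken to be zero-mean Gaussian, the contamination level is 0.1, and a hard limiter stands in for the nonlinear pre-processing, whose exact form is not specified here. The names `sample_mixture` and `limiter` are hypothetical.

```python
import random


def sample_mixture(eps, nominal_sd, burst_sd, rng):
    """Draw one sample from F = (1 - eps) G + eps H with Gaussian G and H.

    Returns (value, is_burst) so the contamination can be inspected.
    """
    if rng.random() < eps:
        return rng.gauss(0.0, burst_sd), True
    return rng.gauss(0.0, nominal_sd), False


def limiter(value, level):
    """Hard limiter: a simple nonlinear pre-processor that clips large values."""
    return max(-level, min(level, value))


rng = random.Random(1)
draws = [sample_mixture(0.1, 1.0, 100.0, rng) for _ in range(20000)]
burst_fraction = sum(1 for _, is_burst in draws if is_burst) / len(draws)
clipped = [limiter(v, 3.0) for v, _ in draws]
```

After clipping, a burst sample can move a recursive estimate by no more than the limiter level, which is the intuition behind pre-processing the data before it enters the recursion.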

An application of the robustized KW and Robbins-Monro (RM) procedures to the
problem  of  threshold  estimation  for  block  data  processing  is  also  presented. 

A.  A Robustized Kiefer-Wolfowitz Procedure

The KW procedure requires two positive null sequences and two samples per iteration to converge to the maximum (minimum) of the regression function. The first of the null sequences is $\{a_n(X_1, \ldots, X_n)\}$ where $a_n(\cdot) = A_n(X_1, \ldots, X_n)/n$ and $\sum a_n(\cdot) = \infty$ for every sequence $X_1, \ldots, X_n, \ldots$, and $A_n(X_1, \ldots, X_n)$ is a suitably selected nonparametric mean-square consistent estimator of some constant A. The second positive null sequence $\{c_n\}$ is chosen s.t. $\sum n^{-2} c_n^{-2} A_n^2(X_1, \ldots, X_n) < \infty$ for every sequence $X_1, \ldots, X_n, \ldots$. The recursion considered is

$$X_{n+1} = X_n - a_n(\cdot)\, c_n^{-1} \left( Y(X_n - c_n) - Y(X_n + c_n) \right)$$

where $Y(X_n \pm c_n)$ is a random variable whose conditional distribution given $X_1 = x_1, \ldots, X_n = x_n$ is the same as the distribution of $Y(x_n \pm c_n)$. It is usually assumed that $Y(X_n - c_n)$ and $Y(X_n + c_n)$ are conditionally independent, i.e., for all Borel sets A and B,

$$P(Y(X_n + c_n) \in A,\ Y(X_n - c_n) \in B \mid X_n) = P(Y(X_n + c_n) \in A \mid X_n)\, P(Y(X_n - c_n) \in B \mid X_n).$$
Following Ref. 8 this assumption is not necessary. Let $Z(x) = Y(x) - M(x)$ and denote $M(X_n - c_n) - M(X_n + c_n)$ by $M_n$ and $Z(X_n - c_n) - Z(X_n + c_n)$ by $Z_n$; then the recursion becomes

$$X_{n+1} = X_n - a_n(\cdot)\, c_n^{-1} (M_n + Z_n)$$

As a result of the assumptions on the distribution of $Y(X_n \pm c_n)$ and the fact that $EY(x) = M(x)$, $EZ(x) = 0$ for all x and the conditional distribution of $Z_n$ given $X_1 = x_1, \ldots, X_n = x_n$ is the same as the distribution of $Z(x_n - c_n) - Z(x_n + c_n)$. This implies that

$$E(Z_n \mid Z_1, \ldots, Z_{n-1}) = E(Z_n \mid X_1, \ldots, X_n) = 0 \quad \text{wp1}.$$
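The two-sample recursion can be sketched in a few lines. The sketch below is illustrative only and makes several assumptions that are not in the text: the gain uses a fixed constant in place of the adaptive estimator, $c_n = n^{-1/4}$ is one admissible choice of the second null sequence, and the regression function is a hypothetical quadratic with Gaussian noise. The names `kiefer_wolfowitz` and `noisy_quadratic` are invented for the example.

```python
import random


def kiefer_wolfowitz(observe, x0, gain, n_iter, rng):
    """Kiefer-Wolfowitz search for the maximizer of a noisy regression function.

    observe(x, rng) returns one noisy observation Y(x).
    Uses a_n = gain / n and c_n = n ** -0.25 (an illustrative choice).
    """
    x = x0
    for n in range(1, n_iter + 1):
        a_n = gain / n
        c_n = n ** -0.25
        y_minus = observe(x - c_n, rng)
        y_plus = observe(x + c_n, rng)
        # X_{n+1} = X_n - a_n c_n^{-1} (Y(X_n - c_n) - Y(X_n + c_n))
        x = x - a_n / c_n * (y_minus - y_plus)
    return x


def noisy_quadratic(x, rng):
    """Hypothetical regression function M(x) = -(x - 2)^2 plus Gaussian noise."""
    return -(x - 2.0) ** 2 + rng.gauss(0.0, 0.1)


estimate = kiefer_wolfowitz(noisy_quadratic, x0=0.0, gain=0.5, n_iter=5000,
                            rng=random.Random(0))
```

With these choices the gains sum to infinity while $\sum a_n^2 c_n^{-2}$ is finite, so both null-sequence requirements are met and the iterate settles near the maximizer at x = 2.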

The first four assumptions deal with the behavior of the regression function M(x) and are similar to those made by Kiefer and Wolfowitz. Intuitively, these conditions require that outside a small neighborhood of the location of the maximum, say $\theta$, the slope is bounded away from zero, M(x) should have no plateaus and be almost quadratic when it is expanded about $\theta$. Moreover, in the entire parameter space M(x) must be "smooth enough" so that radical slope changes cannot cause the recursive procedure to oscillate indefinitely. More explicitly:

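(1) M(x) is a Borel measurable function which has a unique maximum at $x = \theta$ and for $0 < t_0 < t_1 < t_2 < \infty$,

$$\inf_{\substack{t_1 \le |x - \theta| \le t_2 \\ 0 < \epsilon \le t_0}} (x - \theta)\left( M(x - \epsilon) - M(x + \epsilon) \right)/\epsilon > 0.$$

In addition, for all x and suitable $D_1$ and $D_2$, $|M(x + 1) - M(x)| < D_1 + D_2 |x|$.

(2) For all x, $M(x) = \alpha_0 - \alpha (x - \theta)^2 + \delta(x, \theta)$ where $\alpha_0$ is some real number, $\alpha > 0$, and $\delta(x, \theta) = o(|x - \theta|^2)$ as $x - \theta \to 0$.

(3) For some $c_0 > 0$, there exist positive constants $K_1$ and $K_2$ s.t. for any x and for all c for which $0 < c \le c_0$,

$$K_1 (x - \theta)^2 \le (x - \theta)\left( M(x - c) - M(x + c) \right) c^{-1} \le K_2 (x - \theta)^2$$

(4) For every $\epsilon > 0$, there exists $c_\epsilon > 0$ s.t. for all c satisfying $0 < c \le c_\epsilon$ and all x satisfying $|x - \theta| < c_\epsilon$, $|\delta(x - c, \theta) - \delta(x + c, \theta)|\, c^{-1} \le \epsilon |x - \theta|$.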

The  above  assumptions  limit  the  class  of  regression  functions  for  which  one  has 
convergence  in  law.  In  addition,  assumptions  on  the  additive  noise  Z(x)  are  needed. 
The  first  requires  the  variance  of  Z(x)  to  be  bounded  for  any  x and  the  second  condition 
which  is  virtually  a uniform  integrability  condition  upon  the  sequence  [Z  (X^)],  estab- 
lishes the  Lindeberg  condition. 


(5) $\sup_x EZ^2(x) = s < \infty$ and $\lim_{x \to \theta,\, a \to 0} E(Z(x - a) - Z(x + a))^2 = \sigma^2$. In case $Z(x - a)$ and $Z(x + a)$ are uncorrelated we can replace the latter equation with $\lim_{x \to \theta} EZ^2(x) = \sigma^2/2$.

(6) $\sup_{|x - \theta| \le \epsilon} E|Z(x)|^{2+\nu} < \infty$ for some $\epsilon > 0$ and $\nu > 0$.


The last condition places constraints upon $A_n(X_1, \ldots, X_n)$ in order that it be mean-square consistent and bounded.

(7) (a) $0 < A' = \inf_{\{X_i,\, 1 \le i \le n\}} A_n(\cdot) \le \sup_{\{X_i,\, 1 \le i \le n\}} A_n(\cdot) = A'' < \infty$ wp1

(b) $\lim_n \inf_{\{X_i,\, 1 \le i \le n\}} A_n(\cdot) \ge A_1 > 0$ wp1

(c) $E(A_n(\cdot) - A)^2 \to 0$ as $n \to \infty$ s.t. $A_1 \le A \le A''$.

Theorem  1:  Suppose  that  conditions  (1)  to  (7)  are  satisfied.  Let  AK^  > > l/2 

and  let  a^(X^,  • • • ,X^)  = A(Xj,  • • • ,X^)/n.  Let  {c^},  {a^(- )]  be  sequences  of  positive 

Z -2 

numbers  satisfying  ZA  ( • )/n  = oo  and  2a  ( • ) c < oo  for  all  sequences  X . , • • • , X , • • 

n n n j x n 

and  c -»  0 and  n -»  oo  where  c /c  .,  = 1 + e m where  e 0 as  m -»  oo.  Then 
n m m+ 1 m m 

2 2 

Jn  0)  is  asymptotically  normal  with  mean  0 and  variance  o A /8  aA  - 1). 

Proof:  See  Appendix  of  Reference  1 . 

B.  Adaptive  Threshold  Detection  Using  Stochastic  Approximation 

Because  the  stochastic  approximation  algorithms  are  so  computationally  attractive, 
their  use  in  conjunction  with  adaptive  systems  such  as  variable  threshold  detectors  for 
block  data  is  a natural  outgrowth.  In  this  example,  we  illustrate  the  application  of 
both  the  RM  and  the  KW  procedures  to  a variable  threshold  detector  investigated  in 
Ref.  4 and  based  upon  a modified  version  of  Mosteller's  k-sample  slippage  test. 

The detector is a two-sample binary detector with a reference sample $Y_n = (Y_1, \ldots, Y_n)$ of size n and test sample $X_n = (X_1, \ldots, X_n)$. The reference sample is taken under the null hypothesis of no signal present and the n-th order statistic is extracted and denoted by $Y_{\max}$. The slippage statistic R is defined to be the number of $X_i$ which are greater than or equal to $Y_{\max}$. More succinctly

$$R = \sum_{i} I_{\{X_i \ge Y_{\max}\}}$$
and the test of hypothesis is

$$H_0: \prod_{i=1}^{n} F(x_i), \quad \text{i.e., no signal present}$$

$$H_1: \prod_{i=1}^{n} G(x_i), \quad \text{e.g., DC signal present.}$$

The hypothesis testing problem may also represent two grey levels in a picture processing problem. We assume G(x) is stochastically larger than F(x), so $H_0$ is rejected if $R > R_0$ where $R_0 \in \{0, 1, \ldots, n\}$. The discrete distribution of R under $H_0$ due to Mosteller is:

1 - FQ(r)  = r = 0,l.---,n 

and under $H_1$, this distribution becomes

1 - Fj(r)  = (J)  /(I  - G(x))’"g”'’'(x)  dF"(x) 

To simulate noise across picture edges, F(x) is assumed to be uniform [-1/2, 1/2] and G(x) uniform [-1/2 + a, 1/2 + a] where a is the slippage parameter representing the difference in the grey levels.
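The slippage statistic itself is simple to compute. The following sketch counts the test samples that reach the largest reference sample and draws data from the two uniform distributions described above; the function names `slippage_statistic` and `draw_samples`, the sample size, and the slippage value are hypothetical choices for illustration.

```python
import random


def slippage_statistic(reference, test):
    """Number of test samples greater than or equal to the largest reference sample."""
    y_max = max(reference)
    return sum(1 for x in test if x >= y_max)


def draw_samples(n, shift, rng):
    """Reference sample from uniform[-1/2, 1/2]; test sample shifted by `shift`."""
    reference = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    test = [rng.uniform(-0.5, 0.5) + shift for _ in range(n)]
    return reference, test


rng = random.Random(0)
reference, test = draw_samples(n=10, shift=0.7, rng=rng)
r = slippage_statistic(reference, test)
```

A decision would then compare `r` with the chosen threshold; setting `shift` to zero corresponds to the no-signal hypothesis.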

The threshold $R_0$ which minimizes the total probability of error $P_e$ depends upon a and the a priori probabilities of $H_0$ and $H_1$ which are assumed to be equally likely. $P_e$ vs. $R_0$ is plotted in Fig. 1 for several values of slippage. Since $R_0$ is an integer these curves are obtained by randomizing the decision rule where the threshold T is interpreted as

$$T = \begin{cases} \lceil T \rceil & \text{wp } T - \lfloor T \rfloor \\ \lfloor T \rfloor & \text{wp } 1 - (T - \lfloor T \rfloor) \end{cases}$$

where $\lfloor x.y \rfloor = x$ and $\lceil x.y \rceil = x + 1$. To operate in the vicinity of the optimum threshold $R_0$, a supervised stream of binary digits is transmitted and used to adjust the threshold close to its optimum level via stochastic approximation. The random process observed is $Y(x) = I_{[\text{error},\, x]}$ where the event [error, x] = occurrence of an error when x is the threshold. Note that because of the peaked nature of the curves in Fig. 2 that the

4 

theorem in Section A may not be applied although the KW procedure does apply and guarantees mean-square convergence. Burkholder (Ref. 2) establishes the asymptotic normality of the KW procedure for the class of regression functions $M_0 = \{M(x) : M : R \to R$ s.t. M(x) increases (decreases) for $x < \theta$ and decreases (increases) for $x > \theta\}$ which includes


[Figure: regression function M(x) for the Kiefer-Wolfowitz procedure, with G(x) = F(x - a) for a = 0.1, 0.3, 0.5, 0.7, where a is the slippage parameter.]

Fig. 1.  Plot of the K-W regression function for the adaptive threshold detector.

the  regression  function  under  study.  For  M(x)  e he  establishes  the  existence  of  a 
function  p(e)  : R s.  t.  for  any  e >0,  (x  - p(e))(M(x  - e)  - M{x+  e))  > 0,  x ^ U(e)  and 
|p(e)  - G]  <e  for  each  e > 0.  Since  P = M(x)  = -^  + .S  ( y.  - Y.  , )(x  - i)  I r . -v  where 
is  the  slope  of  M(x)  for  XE(i,i+  1),  one  can  show  with  further  analysis  that  m(e)  = 
Rp+  e(Y^+  Y^)/(Yj  - Y^^)  where  Y^  and  Yj  are  the  slope  above  and  below  the  break  point 
Rq.  Using  the  standard  KW  recursion  with  = e » ” A/n,  2 = M^(p(e  ) - e ) - 

M^(|j(e)+e),  2o^  = Var(Y()i(E)  - e))  + Var(Y(p(E)  + E)),  Burkholder  shows  the 

2 2 

$\sqrt{n}\,(X_n - \mu(\epsilon)) \to N(0,\ 2\sigma^2 A^2/(4\alpha_1 A - 1))$. Note that $X_n$ does not converge to $R_0$, but to $\mu(\epsilon)$ located within a distance $\epsilon$ of $R_0$. Simulation of this procedure using a restricted parameter space, i.e., the recursion is restricted to a few integers above and below $R_0$, shows that three thousand samples or less is sufficient to locate $R_0$.
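The randomized interpretation of a non-integer threshold, rounding up with probability equal to the fractional part and down otherwise, can be sketched as follows. The function name `randomized_threshold` and the example values are hypothetical; the uniform draw is passed in explicitly so the rule itself stays deterministic and easy to check.

```python
import math
import random


def randomized_threshold(t, u):
    """Round a real threshold t up with probability equal to its fractional part.

    u is a uniform(0, 1) draw supplied by the caller.
    """
    lower = math.floor(t)
    frac = t - lower
    return lower + 1 if u < frac else lower


rng = random.Random(0)
draws = [randomized_threshold(3.25, rng.random()) for _ in range(20000)]
mean_threshold = sum(draws) / len(draws)
```

Averaged over many decisions the integer threshold equals the real-valued one, which is what allows a recursion over real values to drive an integer-valued decision rule.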

The KW procedure requires two samples per iteration where the RM procedure requires only one. By choosing $Y_n = I_{[\text{error},\, x + \epsilon]} - I_{[\text{error},\, x - \epsilon]}$, $\theta = \mu(\epsilon)$, $\sigma^2 = \mathrm{Var}\, Y_n(\mu(\epsilon))$, and applying the theorem of Section 3 of Ref. 1 with constant gain coefficient A one concludes that $\sqrt{n}\,(X_n - \mu(\epsilon)) \to N(0,\ \sigma^2 A^2/(2\alpha_1 A - 1))$ where $\alpha_1 = \gamma_+ + \gamma_-$. Thus the RM procedure may be applied as well and simulation using a restricted parameter space indicates it performs better than the KW procedure. The regression function for $\epsilon = 1/2$ and $\epsilon = 1/4$ for a = 0.7 is plotted in Figure 2.

Fig. 2.  R-M regression function for the variable threshold detector.

The rate of convergence of both of the above procedures will deteriorate in the presence of Burst Noise. Consider what happens if F(x), instead of being uniform, is a mixture distribution of the form

F(x)  = (1  - 6)[(x+  i/2]W  + I[i/2,  oo]W]  + 

2B  ^ ^[B, 

where B is of the order of $10^3$ and $\delta$ is of the order of $10^{-1}$. Since the reference sample is used over many decision intervals, if $Y_{\max}$ is due to a Burst Noise sample, all decisions based on this reference sample would be determined solely by the Burst Noise.
Therefore,  great  pains  must  be  taken  to  eliminate  any  Burst  Noise  samples  from  the 
reference  sample  and  this  can  easily  be  done  with  proper  filtering  during  the  period 
when  this  sample  is  being  collected.  During  the  decision  interval  a filter  similar  to 
that  in  Fig.  3,  where  is  the  maximum  anticipated  slippage  parameter  will  improve 

convergence.  One  must  be  careful  to  note  that  introduction  of  a filter  will  change  the 
shape  of  the  regression  curves  and  sometimes  the  optimum  threshold.  Optimum  filter 
design  is  a problem  that  must  be  tailored  to  a specific  application  and  does  not  concern 
us  here.  In  simulation  an  interesting  phenomenon  was  observed.  Since  the  regression 
function  for  the  RM  procedure  for  e = l/4  looks  like  a sequence  of  light  limiters  stacked 
upon one another (Fig. 2), one suspects that the RM procedure may get temporarily "hung up" on the wrong "ledge." This has been observed for the small sample size mentioned. However, the Burst Noise when introduced acts as a dither, exercising the system and pushing the recursion to the correct "ledge." In this case one could view the regression function as having its own "built in" filter.
RECURSIVE FACTOR ANALYSIS METHODS IN FEATURE EXTRACTION PROBLEMS
L.  Kurz  and  C.S.  Yoon 

The problems of data reduction, feature extraction and pattern recognition using factor analysis techniques were studied in References 1 and 2. In particular, stochastic approximation techniques were applied to an iterative version of the principal factor method after estimating the unique factors by ad hoc rules. Subsequently, this procedure was applied to the classification of line drawings by applying it to precoded data.

The basic problem considered here is to find the estimates of the factor loading matrix, A, and unique matrix, $D^2$, based on all of n partitioned observations with the assumption that the underlying statistical parameters undergo slow change. This assumption is realistic for many useful data sets such as sleep-stage electroencephalograms or records of long duration surveys of earth resources. In particular, maximum likelihood recursive methodology is developed to find intuitively appealing estimates of unique factors.

One of the strategies to handle the problem is to divide the data evenly into k sections and to process each section one at a time. It is assumed that the sample covariance $Q(T_n)$, based on each section of the data $T_n$, is given by

$$Q(T_n) = Q(T_1, T_2, \ldots, T_k) + \alpha_n$$

where $\alpha_n$ is a p×p random matrix, and the adaptive sample covariance matrix $Q(T_1, \ldots, T_n)$, based on the data $(T_1, T_2, \ldots, T_n)$ is given by

where $\beta_n$ is a p×p random matrix. It is intuitively appealing to conclude that


E[a  ].. 
^ n-'tj 


E[p 

n -‘ij 


where $[\alpha_n]_{ij}$ and $[\beta_n]_{ij}$ are the ij-th elements of $\alpha_n$ and $\beta_n$, respectively. The approach to this partitioned data is to estimate $D^2$ by the maximum likelihood method and to estimate A by a stochastic approximation procedure.

A.  Estimation  of  the  Factor  Loading  Matrix  by  Stochastic  Approximation 

The problem considered in this section is to find the estimates $\hat{A}$ and $\hat{D}^2$ based on all of n observations by iterative procedures. There might be several ways to estimate them. Two new methods are proposed in this section. The first method utilizes an iterative algorithm to obtain a sample covariance matrix based on the whole data and applies Jennrich's algorithm to find the estimate, $\hat{A}$, of A. In the second method, a Robbins-Monro type stochastic approximation algorithm is used to estimate $D^2$ and A after the initial values $D_1^2$ and $A_1$ are chosen. The algorithm based on the stochastic approximation has been proved to converge and this convergence can be accelerated using various schemes.

1 . Method  1 

The sample mean vector $M_1$ based on the first section of data is

$$M_1 = M(T_1) = \frac{1}{n_1} \sum_{i=1}^{n_1} x_i \qquad (1)$$

where $n_1$ is the number of observations in the first section $T_1$.
i = 1 , 2,  • • • , n represents  measurement  and 

j = 1,2,  • • • ,Nj  represents  individual  samples  to  be  classified. 

The mean vector of Eq. (1) is updated every time a new section of data arrives at the processor. At the second stage the mean vector is given by

$$M_2 = M(T_1, T_2) = \frac{1}{n_1 + n_2} \left\{ n_1 M(T_1) + n_2 M(T_2) \right\} \qquad (2)$$

and the updated mean vector is an unbiased estimate of the population mean. Thus, the updated mean $M_i$ is calculated using the previous mean $M_{i-1}$ by

$$M_i = M(T_1, T_2, \ldots, T_i) = \frac{1}{n_1 + n_2 + \cdots + n_i} \left\{ (n_1 + n_2 + \cdots + n_{i-1})\, M_{i-1} + n_i M(T_i) \right\}, \qquad i = 2, 3, \ldots, k \qquad (3)$$

The sample covariance matrix, $Q(T_1)$, based on the section $T_1$ is

$$Q_1 = Q(T_1) = \frac{1}{n_1 - 1} \sum_{i=1}^{n_1} (x_i - M_1)(x_i - M_1)' = \frac{1}{n_1 - 1} \sum_{i=1}^{n_1} x_i x_i' - \frac{n_1}{n_1 - 1} M_1 M_1' \qquad (4)$$


In general, the sample covariance matrix based on k observations is given by

$$Q = \frac{1}{k - 1} \sum_{i=1}^{k} x_i x_i' - \frac{k}{k - 1} M M' \qquad (5)$$

and it can be calculated iteratively by using

$$Q_{n+1} = \frac{1}{N_{n+1} - 1} \sum_{i=1}^{N_{n+1}} x_i x_i' - \frac{N_{n+1}}{N_{n+1} - 1} M_{n+1} M_{n+1}'$$

$$= \frac{N_{n+1}}{N_{n+1} - 1} \left[ \frac{N_n - 1}{N_{n+1}} Q_n + \frac{1}{N_{n+1}} \sum_{i=N_n+1}^{N_{n+1}} x_i x_i' + \frac{N_n}{N_{n+1}} M_n M_n' - M_{n+1} M_{n+1}' \right] \qquad (6)$$

It  is  easy  to  show  the  unbiasedness  of  the  updated  covariance  given  the  algorithm  in 
Equation  (6).  Namely, 

EtQll= 

Efn  ] I - Q + -^  + ■ ^2^2} 

EtQz^-N^-l  t ^l"N2i=N-;  + l ^2  ' ' ' 


N, 


, '1 

Tri  V X v'  1 

^ E \ 

— r ’ X.  X.  f 

1 1-1  1 1 j 

- 1 1 

i=l 

2 

N., 

, ^ 

, > 

N. 


_ V 

By  using  the  iterative  algorithm  in  Eqs.  (6)  and  (3),  a significant  computational  efficiency 
can  be  achieved,  especially  if  storage  capability  of  the  processor  is  restricted. 

With the unique factors decided by one of the ad hoc methods discussed in Harman (Ref. 4) and the final updated covariance matrix, $Q_k$, the factor loading matrix, A, based on the whole observations can be calculated by one of the estimation methods such as the principal factor method, centroid method, minres method or maximum likelihood method.
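The sectioned updating of the mean and covariance used in Method 1 can be sketched as follows. This is an illustrative sketch, not the authors' program: it assumes that the running mean is the pooled mean of all samples seen so far and that the covariance update needs only the previous covariance, the previous mean, and the sum of outer products over the new section. The function names `outer`, `batch_stats`, and `update_stats` and the data values are hypothetical.

```python
def outer(u, v):
    """Outer product u v' as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]


def batch_stats(rows):
    """Sample mean vector and unbiased sample covariance of a list of vectors."""
    n, p = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = [[sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in rows) / (n - 1)
            for b in range(p)] for a in range(p)]
    return mean, cov


def update_stats(n_old, mean_old, cov_old, section):
    """Fold a new section into the running mean and covariance."""
    p = len(mean_old)
    n_new = n_old + len(section)
    sec_sum = [sum(r[j] for r in section) for j in range(p)]
    mean_new = [(n_old * mean_old[j] + sec_sum[j]) / n_new for j in range(p)]
    # Sum of x x' over the new section only; older samples are not needed.
    sec_xx = [[sum(r[a] * r[b] for r in section) for b in range(p)] for a in range(p)]
    mm_old = outer(mean_old, mean_old)
    mm_new = outer(mean_new, mean_new)
    cov_new = [[((n_old - 1) * cov_old[a][b] + sec_xx[a][b]
                 + n_old * mm_old[a][b] - n_new * mm_new[a][b]) / (n_new - 1)
                for b in range(p)] for a in range(p)]
    return n_new, mean_new, cov_new


# Hypothetical two-dimensional data split into two sections.
section1 = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0]]
section2 = [[4.0, 3.0], [0.0, 1.0]]
mean1, cov1 = batch_stats(section1)
n2, mean2, cov2 = update_stats(len(section1), mean1, cov1, section2)
```

Only the count, the mean vector, and the covariance matrix are carried from one section to the next, which is where the storage saving comes from.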

2.  Method  2 

When the first section arrives, the sample covariance $Q_1$ is calculated and the unique factor $D_1^2$ is estimated by the maximum likelihood estimation method. While $D_1^2$ is estimated, $A_1$ is calculated in such a way that

$$(Q_1 - D_1^2) = A_1 A_1'$$

Every time a new section of observations arrives, the sample covariance is updated and the matrix A is also updated until the final factor loading matrix, based on all observations, is obtained. The stochastic approximation procedure is applied to obtain the final matrix A with $A_1$ and $D_1^2$ as initial values. One may develop three sets of recursive procedures to estimate A and $D^2$ from the relation,

$$(Q - D^2) A - A J = 0 \qquad (6a)$$

where

$$J = A' A$$

which was derived using the principal factor method, and another relation

$$D^2 = \mathrm{diag}(Q - A A') \qquad (7)$$

which  was  obtained  from  the  maximum  likelihood  estimation  method.  The  usual  RM 
type  recursions 


A ^ = A + - [(Q  - D^)A  - A J ] 

— n+1  — n n — n — n — n — n— n 

+ - [diag  (Q  - A a'  - D^)] 
—n+1  — n n '■  ® — n — n — n — n 

are  not  suitable  in  this  problem  because 


'9l  - J 1 = ° 


(8a) 

(8b) 


(9) 


Since  it  was  assumed  that  the  parameters  of  distribution  of  the  data  are  slowly  varying, 
one  may  consider  that 


— n+  1 


Q 
— n 


6 + 
— n 


(10) 


n = 1 , 2, 3,  • • • 

where  6 is  a zero-mean  error  matrix  and  each  element 


[e]..  « [Q  ].. 
IJ  n IJ 


will be replaced by $Q_{n+1}$ in the improvement part of the iteration. It is possible then to generate the RM type recursions to update a factor loading matrix and a unique factor matrix.

At first, the RM type recursive equation can be developed from the relations obtained by the principal factor method. Suppose that an updated covariance matrix, $Q_{n+1}$, is available, then a factor loading matrix, $A_n$, is updated by


A^,=A  + — [(Q^,-D^)A  - AJ] 

— n+1  — n n —n+1  — n — n — n— n 


(Ha) 
where $\{c_n\}$ is a gain sequence, and the unique factor matrix is


D 


n+1  = diag(Q^^^  - 


(11b) 


Secondly,  a recursive  version  can  be  constructed  from  the  equation  developed 
from  the  maximum  likelihood  equation  and 


+ — [diag  (Q 
— n+1  — n n “ — 


„ - A a'  - D'^)] 

-n+1  -n-n  -n'"' 


(12a) 


-£n+l 


'-n+1 


-n+1  -n+1 


(12b) 


are obtained. Since eigenvectors should be calculated repeatedly in Eq. (12b), this set of equations is not computationally useful.

In  view  of  that  fact,  a new  procedure  is  used.  The  recursive  procedures  generat- 
ed from  both  methods  can  be  used  to  update  two  matrices.  The  unique  factor  matrix 
is  updated  by 


Q 

+ — [diag  (Q  , , - A A'  - D^)] 
— n+1  — n n ^ n+1  -n-n  — n 

then  the  factor  loading  matrix  is  improved  by 


-n+ 


= A +— ^-[(Q.,-D^,,)A  - AJ] 

1 — n n —n+l  — n+1  — n — n— n 


(13a) 


(13b) 


The convergence of the sequence $\{A_n\}$ generated by the algorithm in Eq. (11) will be discussed in Theorem 1 and the one generated by the algorithm in Eq. (13) will be discussed in Theorem 2.

Theorem 1: Suppose that the observed random variables $X$ have bounded covariance matrices $Q_1, Q_2, \ldots$, and each of them is a covariance matrix of a factor analysis model. Let us choose $A_1$ and $D_1^2$ such that


1°,  -£?)'A,a; 

The sequence of estimators $\{A_n\}$ generated by the algorithm in Eq. (11a) converges to the $A$ satisfying Eq. (3) with probability 1.

Proof: See Reference 5.

Theorem 2: Suppose that the assumptions in Theorem 1 hold and that $D_n^2$ generated by the algorithm in Eq. (13a) has an asymptotically bounded variance; then the sequence of estimators $\{A_n\}$ generated by the algorithm in Eq. (13b) converges to the matrix $A$ satisfying Eq. (3) with probability 1.

Proof: See Reference 5.


Variance of $D_n^2$

Since the variance of $D_n^2$ is an important factor in proving convergence of the algorithm in Theorem 2, and $D_n^2$ is generated iteratively by the algorithm in Eq. (13a), the asymptotic behavior of the variance of $D_n^2$ is of interest. This behavior yields information about the quality of the estimator.

Since $D_n^2$ and the term in the brackets of the algorithm in Eq. (13a) are diagonal matrices, they can be considered as vectors. The vector version of Sacks' asymptotic normality theorem$^{6,7}$ could be used to find the asymptotic variance, since the form of the algorithm fits Sacks' iteration scheme. To avoid unnecessary complexities, a scalar version for each component will be used. The components of $D_n^2$ are iteratively computed by

$$d_{n+1,i} = d_{n,i} + \frac{b_i}{n}\left[(q_{n+1})_{ii} - \sum_{j=1}^{K}(a_{n,ij})^2 - d_{n,i}\right] \tag{14}$$

$$i = 1, 2, 3, \ldots, N; \qquad n = 1, 2, \ldots,$$

where $(q_{n+1})_{ii}$ is the i-th diagonal term of the covariance matrix $Q_{n+1}$. Thus, $N$ scalar estimators are generated simultaneously, yielding the components of the vector estimator.


Theorem 3: Suppose that the i-th component of $D_n^2$ is estimated by the algorithm in Eq. (14) with initial values $D_1^2$ and $A_1$ obtained by one of the factor loading estimation methods. Let

$$Y_n(d_{n,i}, d_i) = M(d_{n,i}, d_i) + Z_n(d_{n,i}, d_i)$$

where $E[Z_n(d_{n,i}, d_i)] = 0$ and $M(d_{n,i}, d_i)$ has a unique root at $d_{n,i} = d_i$. If the following conditions are satisfied:

Condition (1)

$$(d_{n,i} - d_i)\,M(d_{n,i}, d_i) > 0 \quad \text{for all } d_{n,i} \ne d_i$$

Condition (2)

$$K\,|d_{n,i} - d_i| \le |M(d_{n,i}, d_i)| \le K_1\,|d_{n,i} - d_i|$$

Condition (3)

Condition (4)

a) $\sup E[Y(d') - M(d')]^2 < \infty$

b) $\lim_{d' \to d_i} E[Y(d') - M(d')]^2 = \sigma^2$

Condition (5)

$$\sup_{|d' - d_i| < c} E\,|Y(d') - M(d')|^{2+\nu} < \infty \quad \text{for some } c > 0,\ \nu > 0$$

then the sequence of estimators, $d_{n,i}$, converges to $d_i$ with probability one and its asymptotic variance is given by

$$\lim_{n \to \infty} \operatorname{var}(d_{n,i}) = \frac{\sigma^2 b_i^2}{n(2b_i - 1)}$$

Proof: See Reference 5.
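As a small numerical companion to Theorem 3, the sketch below performs one step of the scalar recursion for a single unique-variance component and evaluates the asymptotic variance expression. The function names and argument layout are illustrative assumptions; only the two formulas come from the text.

```python
def update_component(d, n, b, q_ii, loadings_row):
    """One step of the scalar recursion for a unique-variance component.

    d            : current estimate of the component
    n            : iteration index, starting at 1
    b            : gain constant for this component
    q_ii         : i-th diagonal term of the current covariance matrix
    loadings_row : i-th row of the current loading matrix
    """
    communality = sum(a * a for a in loadings_row)
    return d + (b / n) * (q_ii - communality - d)


def asymptotic_variance(sigma2, b, n):
    """Evaluate sigma^2 b^2 / (n (2b - 1)); meaningful only for b > 1/2."""
    if b <= 0.5:
        raise ValueError("the expression requires b > 1/2")
    return sigma2 * b * b / (n * (2.0 * b - 1.0))
```

For b = 1 the expression reduces to sigma^2 / n, and the factor b^2 / (2b - 1) is smallest at b = 1, so gain constants far from one inflate the asymptotic variance.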

Thus, two sets of iterative algorithms have been developed to extract a factor loading matrix from a continuous flow of data. The well-developed stochastic approximation algorithm was applied to maximum-likelihood estimation of the factor loading matrix. With proper selection of the gain constants and weighting functions, the algorithms operate efficiently.

NONPARAMETRIC PROBABILITY DENSITY ESTIMATION
I.M.  Habib  and  L.  Kurz 

The estimation of probability density functions (p.d.f.'s) and related problems has always been of major interest in the processing of computed data. The problem has additional attraction because, once it is solved, the related problems of channel condition monitoring, design of adaptive detectors, and selection of optimum scores for nonparametric detectors become easily tractable. The determination of a p.d.f. that is optimal in some sense and fits a set of independent identically distributed (i.i.d.) samples has been considered by many researchers. The approach taken by the investigators was to obtain robust and nonparametric procedures for estimation of an unknown p.d.f. However, these methods resulted in estimators which fit a certain p.d.f. quite well and do not fit other p.d.f.'s. Essentially, these procedures are nonparametric but not robust.

The method described here is based on the philosophy suggested by Cencov.$^{9}$ There are two basic differences in the approach taken. Instead of unspecified orthonormal sets, three specific orthonormal sets of functions are used, which result in excellent estimators for three classes of distributions: near-gaussian, clipped impulsive noise, and skewed. The choice of the estimation procedure in a particular application results in efficient use of computation time and a minimum number of terms in the expansion for a given mean-square error of estimation. The procedures presented compare favorably with the most efficient published results.

A.  Formulation  of  the  Estimation  Problem 

The estimate of the unknown p.d.f., $f(x)$, based on $N$ i.i.d. samples is of the form

$$\hat f_N(x) = \sum_{i=1}^{r} a_{iN}\,\theta_i(x) \tag{1}$$

where

$a_{iN}$, $i = 1, \ldots, r$ are the expansion coefficients

$\theta_i(x)$, $i = 1, \ldots, r$ are the appropriately preselected sets of orthonormal functions.

If the error

$$e = \int_x \left[f(x) - \sum_{i=1}^{r} a_{iN}\,\theta_i(x)\right]^2 dx \tag{2}$$

is to be minimized, then the coefficients $a_{iN}$ must satisfy

$$E[\theta_i(x) - a_{iN}] = 0, \qquad i = 1, \ldots, r \tag{3}$$

To estimate $a_{iN}$, a Robbins-Monro one-dimensional stochastic approximation algorithm is used. In particular, the recursion relation is

$$a_{iN}^{(k+1)} = a_{iN}^{(k)} + \gamma_k\left[\theta_i(x_{k+1}) - a_{iN}^{(k)}\right] \tag{4}$$

where

$$\gamma_k = \frac{1}{k+1}$$

and $a_{iN}^{(k)}$ converges to $a_{iN}$ in the mean-square sense and with probability one.
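The Robbins-Monro recursion for a single expansion coefficient can be sketched as follows. This is an illustration under assumptions: the gain is taken as 1/(k+1) with k counted from zero, so the first sample receives gain one, and `theta` stands for whichever basis function is being fitted. The name `rm_coefficient` is hypothetical.

```python
import math


def rm_coefficient(samples, theta, a0=0.0):
    """Recursively estimate E[theta(x)] from a stream of samples.

    samples : iterable of observations
    theta   : one basis function (any callable of a single number)
    a0      : starting value; it is overwritten by the first update
    """
    a = a0
    for k, x in enumerate(samples):
        gamma = 1.0 / (k + 1)          # decreasing gain (assumed indexing)
        a += gamma * (theta(x) - a)    # correction toward the new observation
    return a
```

With this gain the recursion is just a running average of `theta(x)`, which is why the starting value `a0` has no influence on the result once the first sample has been processed.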

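B. Gaussian and Near-Gaussian Distributions

For this class of distributions, the best set of orthogonal functions is the set of Hermite polynomials defined by

$$H_i(x) = (-1)^i e^{x^2}\frac{d^i}{dx^i}e^{-x^2} \tag{5}$$

with the orthogonality equations

$$\int_{-\infty}^{\infty} e^{-x^2} H_i(x) H_j(x)\,dx = 2^i\,i!\,\sqrt{\pi}\,\delta_{ij} \tag{6}$$

where $\delta_{ij}$ is the Kronecker delta. Introducing

$$c_i = a_{iN}\,2^i\,i!\,\sqrt{\pi} \tag{7}$$

Equation (4) reduces to

$$c_i^{(k+1)} = c_i^{(k)} + \gamma_k\left[2^i\,i!\,\sqrt{\pi}\,\theta_i(x_{k+1}) - c_i^{(k)}\right] \tag{8}$$

and the estimator of the unknown p.d.f. is

$$\hat f(x) = \sum_{i=1}^{r}\frac{c_i}{2^i\,i!\,\sqrt{\pi}}\,\theta_i(x) \tag{9}$$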
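A Hermite-based series estimator can be sketched in pure Python. The exact normalization used in the report is not recoverable from this text, so the sketch swaps in the standard orthonormal Hermite functions (Hermite polynomial times exp(-x^2/2), scaled to unit norm), generated by their three-term recurrence. The coefficients are plain sample means, which is what a running-mean recursion converges to. All function names are illustrative assumptions.

```python
import math


def hermite_functions(x, r):
    """Return [h_0(x), ..., h_{r-1}(x)], the orthonormal Hermite functions."""
    h = [math.pi ** (-0.25) * math.exp(-0.5 * x * x)]
    if r > 1:
        h.append(math.sqrt(2.0) * x * h[0])
    for n in range(1, r - 1):
        # Normalized three-term recurrence for h_{n+1}.
        h.append(math.sqrt(2.0 / (n + 1)) * x * h[n]
                 - math.sqrt(n / (n + 1)) * h[n - 1])
    return h[:r]


def fit_coefficients(samples, r):
    """Sample means of h_i over the data, one coefficient per basis function."""
    sums = [0.0] * r
    for x in samples:
        for i, value in enumerate(hermite_functions(x, r)):
            sums[i] += value
    return [s / len(samples) for s in sums]


def density_estimate(x, coeffs):
    """Evaluate the truncated series at the point x."""
    basis = hermite_functions(x, len(coeffs))
    return sum(a * v for a, v in zip(coeffs, basis))
```

A truncated series of this kind is not guaranteed to be nonnegative or to integrate to one, so in practice the estimate may need clipping and renormalizing before it is used as a density.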


The procedure was applied to a contaminated sample of size 100 with a nominal distribution N(0, 1). The solution was carried out on a Univac digital computer and the results are shown in Figure 1. The fit is very close, and much better than that obtained with the method suggested by Good and Gaskins when the comparison is based on fixed computation time.

Fig. 1. p.d.f. estimator for a N(0, 1) distribution.

C.  Clipped  Impulsive  Noise  Distributions 

For this class of distributions the suitable set of orthonormal functions was found to be the set of prolate spheroidal wave functions $\{\psi_i(c,x)\}$, which satisfy the relations

$$\int_{-\infty}^{\infty} \psi_i(c,x)\,\psi_j(c,x)\,dx = \delta_{ij} \tag{10}$$

and

$$\int_{-T/2}^{T/2} \psi_i(c,x)\,\psi_j(c,x)\,dx = \lambda_i\,\delta_{ij} \tag{11}$$

where $\lambda_i$ are the eigenvalues corresponding to the eigenfunctions $\{\psi_i(c,x)\}$. Introducing


and proceeding as in Section B, the estimator of the p.d.f. is then

$$\hat f(x) = \sum_{i=1}^{r} a_{iN}\,\psi_i(c,x) \tag{13}$$

The procedure was applied to a sample of size 100 drawn from the output of a hard limiter represented by

$$x_T = \int_{-T/2}^{T/2} \operatorname{sgn}[y(t)]\,dt \tag{14}$$

The input to the limiter, $y(t)$, consisted of large-variance impulsive noise represented in time by

$$p(n, T) = \frac{(\alpha T)^n e^{-\alpha T}}{n!} \tag{15}$$

where $\alpha$ is the expected number of zero crossings per unit time and $n$ is the number of zero crossings in an interval of length $T$. McFadden$^{18}$ gives the exact expression for $f(x_T)$ as

$$f(x_T) = \frac{\alpha e^{-\alpha T}}{2}\left[I_0\!\left(\alpha T\sqrt{1-(x_T/T)^2}\right) + \frac{I_1\!\left(\alpha T\sqrt{1-(x_T/T)^2}\right)}{\sqrt{1-(x_T/T)^2}}\right] + \frac{e^{-\alpha T}}{2}\left[\delta(x_T - T) + \delta(x_T + T)\right] \quad \text{for } |x_T| \le T \tag{16}$$

$$f(x_T) = 0 \quad \text{for } |x_T| > T$$

where $I_0(\cdot)$ and $I_1(\cdot)$ are modified Bessel functions of the first kind, of order zero and one, respectively. For $T = 10$, $\alpha = 0.5$, $c = 8$, $r = 8$, the comparison of the estimated and exact p.d.f.'s is shown in Figure 2. The discrete portion of the distribution is omitted.
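Test samples of the hard-limiter output can be generated with a short simulation. The sketch below assumes that the zero crossings of the limiter input form a Poisson process with rate `alpha` (so the gaps between crossings are exponential) and that the initial sign is equally likely to be positive or negative. Both are assumptions of this illustration, and the function names are hypothetical.

```python
import random


def limiter_integral(T, alpha, rng):
    """Integrate a +/-1 signal over an interval of length T.

    Sign changes occur at the event times of a Poisson process with
    rate alpha (alpha must be positive).
    """
    sign = rng.choice((-1.0, 1.0))
    t, total = 0.0, 0.0
    while True:
        gap = rng.expovariate(alpha)       # time until the next crossing
        if t + gap >= T:
            return total + sign * (T - t)  # last segment runs to the end
        total += sign * gap
        t += gap
        sign = -sign


def draw_sample(size, T=10.0, alpha=0.5, seed=0):
    """Draw `size` independent limiter outputs with a reproducible seed."""
    rng = random.Random(seed)
    return [limiter_integral(T, alpha, rng) for _ in range(size)]
```

Every simulated value lies in the interval from -T to T, and the values exactly at the endpoints correspond to realizations with no crossing at all, that is, to the discrete portion of the distribution.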

D.  Skewed  Distributions 

For this class of distributions the set of orthonormal Laguerre functions is applicable. The Laguerre functions are given by

$$L_n(\tau) = \sqrt{2p}\left[\frac{(2p)^n}{n!}\tau^n - \frac{n(2p)^{n-1}}{(n-1)!}\tau^{n-1} + \frac{n(n-1)(2p)^{n-2}}{2!\,(n-2)!}\tau^{n-2} - \frac{n(n-1)(n-2)(2p)^{n-3}}{3!\,(n-3)!}\tau^{n-3} + \cdots + (-1)^n\right]e^{-p\tau}$$

$$n = 0, 1, 2, \ldots \quad \text{and} \quad 0 < \tau < \infty$$

In this case Eqs. (4) and (1) are applicable with $\theta_i(x) = L_i(x)$. As an example, the estimation problem was solved for a sample of size 100 from a Beta distribution with the p.d.f. specified by

$$f(x) = \frac{\Gamma(\alpha+\beta+2)}{\Gamma(\alpha+1)\,\Gamma(\beta+1)}\,x^{\alpha}(1-x)^{\beta}, \qquad 0 \le x \le 1,\ \alpha > -1,\ \beta > -1$$

$$f(x) = 0 \quad \text{elsewhere}$$

For $\beta = 1$, $\alpha = 2$, $r = 30$ and $p = 0.5$, the results of the estimation are shown in Figure 3.
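The Laguerre functions can be evaluated without expanding the bracketed polynomial term by term. The sketch below uses the standard three-term recurrence for Laguerre polynomials in the variable u = 2pt and applies the factor (-1)^n so that the leading coefficient is positive, matching the sign pattern of the series written above. The prefactor sqrt(2p) is the usual orthonormalizing constant and is an assumption here, as is the function name.

```python
import math


def laguerre_functions(t, r, p=0.5):
    """Return [l_0(t), ..., l_{r-1}(t)] for t >= 0 and scale parameter p > 0."""
    u = 2.0 * p * t
    polys = [1.0, 1.0 - u]                 # Laguerre polynomials L_0, L_1
    for k in range(1, r - 1):
        # (k+1) L_{k+1}(u) = (2k + 1 - u) L_k(u) - k L_{k-1}(u)
        polys.append(((2 * k + 1 - u) * polys[k] - k * polys[k - 1]) / (k + 1))
    scale = math.sqrt(2.0 * p) * math.exp(-p * t)
    return [(-1) ** n * scale * polys[n] for n in range(r)]
```

The recurrence avoids the large alternating terms that appear when the explicit series is summed for high orders, which matters when as many as 30 terms are used.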